Wednesday, 2014-10-01

*** achanda has quit IRC00:02
*** ChuckC has quit IRC00:09
*** rloo has quit IRC00:16
*** marcoemorais has quit IRC00:28
*** hemna__ has quit IRC00:35
*** hemna__ has joined #openstack-ironic00:40
*** ChuckC has joined #openstack-ironic00:43
*** yuanying has joined #openstack-ironic00:47
*** chuckC_ has joined #openstack-ironic00:51
*** yuanying has quit IRC00:55
*** r-daneel has quit IRC01:04
*** HenryG_ has joined #openstack-ironic01:22
*** HenryG has quit IRC01:24
*** hemna__ has quit IRC01:37
*** yuanying has joined #openstack-ironic01:52
*** nosnos has joined #openstack-ironic01:57
*** chuckC_ has quit IRC02:11
*** HenryG_ has quit IRC02:25
*** HenryG has joined #openstack-ironic02:41
*** harlowja is now known as harlowja_away02:42
*** ramineni has joined #openstack-ironic02:55
*** pcrews has quit IRC03:09
*** nosnos has quit IRC03:21
*** nosnos has joined #openstack-ironic03:22
*** rushiagr_away is now known as rushiagr03:26
*** nosnos has quit IRC03:26
*** nosnos has joined #openstack-ironic04:20
*** nikunj2512 has joined #openstack-ironic04:35
*** rushiagr is now known as rushiagr_away04:52
*** killer_prince is now known as lazy_prince05:00
*** rakesh_hs has joined #openstack-ironic05:09
*** smoriya has joined #openstack-ironic05:09
*** marcoemorais has joined #openstack-ironic05:14
*** marcoemorais1 has joined #openstack-ironic05:15
*** marcoemorais has quit IRC05:19
*** rushiagr_away is now known as rushiagr05:35
*** k4n0 has joined #openstack-ironic05:47
*** rushiagr is now known as rushiagr_away05:50
*** rushiagr_away is now known as rushiagr05:51
*** pensu has joined #openstack-ironic05:55
*** AJaeger has left #openstack-ironic05:58
*** dtantsur|afk is now known as dtantsur06:36
dtantsurMorning Ironic06:36
devanandamornin :)06:37
devanandai'm about to hack out some changes to the hash ring patches, then pass out06:37
dtantsurheh, unusual to see you online in the morning :)06:40
dtantsurok will review them today (though I can't really test them I guess)06:40
devanandadtantsur: what do you think of s/is_active_on/conductor_affinity/06:42
dtantsurdevananda, +1 to having word 'conductor' in it06:42
*** ifarkas has joined #openstack-ironic06:44
dtantsurdevananda, re 'affinity'... it sounds like some constant property, 'is_active_on' sounds like something transient06:45
devanandayep06:45
devanandai'm not unsetting it any more :)06:45
dtantsurah so I need to refresh my knowledge of your patches06:45
devanandarloo's comments convinced me, there's no need to unset it during tear down06:45
dtantsur(and also to wake up)06:45
devanandawell, i haven't posted them yet :)06:45
dtantsurI see :)06:46
devanandait merely indicates the last conductor to have done something to that node06:46
devanandawhich may or may not indicate that there is some persistent local state on taht conductor06:46
devanandabut what that really means is: if a node is mapped to the conductor which did NOT most recently maybe create some lcoal state, then it should try to do that06:47
devanandaif affinity != mapping: do take_over()06:48
GheRiverodevananda: pong06:49
devanandaGheRivero: 17:00:24 < devananda> also, running unit tests against mysql locally, I see only two INSERTs into one of our tables,a nd I dont see any ALTERs at all06:50
devanandaany thoughts on why I don't see ALTERs ?06:51
GheRiverono idea. Will check it now to see if I get the same behavior06:52
*** marcoemorais1 has quit IRC07:01
openstackgerritMark Atwood proposed a change to openstack/ironic: Cleans up some Sphinx rST warnings in Ironic  https://review.openstack.org/12527307:02
dtantsurdevananda, makes sense07:06
openstackgerritOpenStack Proposal Bot proposed a change to openstack/ironic: Updated from global requirements  https://review.openstack.org/12527407:11
GheRiverodevananda: indeed, there are only two inserts but i have multiple ALTERS07:17
devanandaGheRivero: can you paste one?07:18
GheRiveroALTER TABLE nodes ADD COLUMN provision_updated_at DATETIME07:19
GheRiverohttp://paste.openstack.org/show/117373/07:19
*** jcoufal has joined #openstack-ironic07:20
devanandaGheRivero: I see. what db backend are you using // how is it configured?07:20
GheRiveroyou just need a mysql server with the account openstack_citest :openstack_citest07:22
devanandathat's what I have07:22
devanandaand I see alembic creating tables and such, but no ALTERs07:22
GheRiveroI didn't try with pg07:26
devanandai'm using mysql07:26
GheRiveroweird that you get the insert but not the alter07:28
GheRiverothe first time i have some issues with the openstack_citest user permissions, but the tests were falling long way before doing anything07:29
*** Haomeng has joined #openstack-ironic07:35
*** Haomeng|2 has quit IRC07:36
*** Haomeng|2 has joined #openstack-ironic07:39
*** Haomeng has quit IRC07:40
*** coolsvap|afk is now known as coolsvap07:48
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add "affinity" tracking to nodes and conductors  https://review.openstack.org/12449307:49
*** andreykurilin_ has joined #openstack-ironic07:51
*** MattMan has quit IRC07:56
*** MattMan has joined #openstack-ironic07:57
*** jistr has joined #openstack-ironic08:02
*** ndipanov_gone is now known as ndipanov08:04
*** lsmola has joined #openstack-ironic08:04
*** viktors|afk is now known as viktors08:08
*** yuanying has quit IRC08:10
*** yuanying has joined #openstack-ironic08:11
*** yuanying has quit IRC08:15
*** lucasagomes has joined #openstack-ironic08:18
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add periodic task to rebuild conductor local state  https://review.openstack.org/12461008:20
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Make hash ring mapping be more consistent  https://review.openstack.org/11893208:21
* devananda sleeps08:22
*** coolsvap is now known as coolsvap|afk08:24
*** derekh has joined #openstack-ironic08:26
lucasagomesdevananda, g'night :)08:38
*** igordcard has joined #openstack-ironic08:41
vdrokmorning ironic!08:55
vdrokdevananda, will investigate db tests today08:56
*** andreykurilin_ has quit IRC09:00
*** yuanying has joined #openstack-ironic09:12
*** loki184 has joined #openstack-ironic09:16
*** rushiagr is now known as rushiagr_away09:31
*** dtantsur is now known as dtantsur|lunch09:38
*** wendar has quit IRC09:38
*** wendar has joined #openstack-ironic09:38
*** yuanying has quit IRC09:46
openstackgerritA change was merged to openstack/ironic: Updated from global requirements  https://review.openstack.org/12527410:02
*** tonycoffman has quit IRC10:17
*** sirushti has quit IRC10:17
*** sirushti has joined #openstack-ironic10:17
*** tonycoffman has joined #openstack-ironic10:18
openstackgerritAnusha Ramineni proposed a change to openstack/ironic: ilo* drivers to use only ilo credentials  https://review.openstack.org/12470410:19
openstackgerritJulien Danjou proposed a change to openstack/ironic: Replace custom lazy loading by stevedore  https://review.openstack.org/12509010:32
*** yuanying has joined #openstack-ironic10:42
*** ramineni has quit IRC11:00
*** loki184 has quit IRC11:02
*** nikunj2512 has quit IRC11:04
*** Haomeng has joined #openstack-ironic11:10
*** Haomeng|2 has quit IRC11:11
*** yuanying has quit IRC11:16
*** rakesh_hs has quit IRC11:25
openstackgerritA change was merged to openstack/ironic: Use DbTestCase as test base when context needed  https://review.openstack.org/12440011:29
*** GheRivero has quit IRC11:30
*** GheRivero has joined #openstack-ironic11:30
*** GheRivero has quit IRC11:30
*** lazy_prince is now known as killer_prince11:31
*** GheRivero has joined #openstack-ironic11:31
*** yuanying has joined #openstack-ironic11:42
*** dtantsur|lunch is now known as dtantsur11:46
dtantsurjroll, hi, what's the staus of agent jobs? devstack fails for them: http://logs.openstack.org/90/125090/2/check/check-tempest-dsvm-ironic-agent_ssh-nv/dfa94dd/logs/devstacklog.txt.gz11:46
*** dprince has joined #openstack-ironic11:55
*** lucasagomes is now known as lucas-hungry12:07
*** pensu has quit IRC12:07
*** yuanying has quit IRC12:15
openstackgerritA change was merged to openstack/ironic: Replace custom lazy loading by stevedore  https://review.openstack.org/12509012:20
Shrewss'up, ironic?12:25
*** igordcard has quit IRC12:27
dtantsurShrews, hi12:38
*** romcheg has quit IRC12:38
*** igordcard has joined #openstack-ironic12:39
*** romcheg has joined #openstack-ironic12:44
*** andreykurilin_ has joined #openstack-ironic12:49
*** pensu has joined #openstack-ironic12:54
*** shikui_ has quit IRC12:54
*** andreykurilin_ has quit IRC12:57
*** rloo has joined #openstack-ironic13:05
*** yuanying has joined #openstack-ironic13:12
jrollmorning ironig13:12
jrolldtantsur: it's a known issue, tl;dr is that ramdisk doesn't load in 512MB RAM13:13
jrolldtantsur: but setting that higher isn't viable long term :(13:13
dtantsurjroll, morning, understood13:13
*** lucas-hungry is now known as lucasagomes13:14
lucasagomesjroll, morning13:14
jrollmorning lucas13:14
*** pensu has quit IRC13:17
Shrewslucasagomes: i'm sure you saw, but making sure... moved your node object code from 124610 to 124493 and made you co-author on it.13:28
lucasagomesShrews, yup! saw it13:28
*** nosnos has quit IRC13:28
lucasagomesthanks for moving it and adding me as co-author13:28
lucasagomesI'm adding tests to the hash ring and addressing comments13:28
*** nosnos has joined #openstack-ironic13:29
lucasagomesthere's some other comments to address on 124493 too13:29
lucasagomesShrews, and morning :)13:31
Shrewsmorning lucasagomes, dtantsur, jroll, and anyone i missed13:31
*** pensu has joined #openstack-ironic13:32
jrollheya Shrews :)13:32
*** nosnos has quit IRC13:33
*** yuanying has quit IRC13:46
*** pelix has joined #openstack-ironic14:00
*** krtaylor has quit IRC14:01
*** dprince has quit IRC14:07
*** openstackgerrit has quit IRC14:18
*** openstackgerrit has joined #openstack-ironic14:19
*** ramineni has joined #openstack-ironic14:23
openstackgerritAnusha Ramineni proposed a change to openstack/ironic: ilo* drivers to use only ilo credentials  https://review.openstack.org/12470414:25
*** krtaylor has joined #openstack-ironic14:25
*** dprince has joined #openstack-ironic14:37
*** ramineni has quit IRC14:37
*** dhellmann has joined #openstack-ironic14:38
*** dhellmann has quit IRC14:38
openstackgerritVladyslav Drok proposed a change to openstack/ironic: Move database fixture to a separate test case  https://review.openstack.org/12536114:40
*** dhellmann has joined #openstack-ironic14:41
*** yuanying has joined #openstack-ironic14:42
*** pcrews has joined #openstack-ironic14:47
*** openstackgerrit has quit IRC14:47
*** openstackgerrit has joined #openstack-ironic14:49
NobodyCammorning ironic....almost14:50
dtantsurNobodyCam, hi! almost morning or almost ironic? :)14:53
romchegOr almost NobodyCam? :-P14:53
romchegHi guys!14:53
*** romcheg has left #openstack-ironic14:54
dtantsurrom<TAB> :(14:55
openstackgerritA change was merged to openstack/python-ironicclient: Add "ironic node-set-power-state" cmd unit test  https://review.openstack.org/12426714:56
*** dhellmann_ has quit IRC14:57
lucasagomesNobodyCam, morning15:00
openstackgerritLucas Alvares Gomes proposed a change to openstack/ironic: Add periodic task to rebuild conductor local state  https://review.openstack.org/12461015:00
lucasagomeswith tests now ^15:00
*** dhellmann_ has joined #openstack-ironic15:01
*** chuckC_ has joined #openstack-ironic15:02
*** openstackgerrit has quit IRC15:02
*** openstackgerrit has joined #openstack-ironic15:04
NobodyCamOkay Now G.M.I. (Good MOrning Ironic)15:05
*** dhellmann_ has quit IRC15:05
NobodyCammorning dtantsur, rocheg lucasagomes, Shrews and rloo15:05
Shrewsmorning NobodyCam15:06
*** dhellmann_ has joined #openstack-ironic15:06
rlooHi NobodyCam, dtantsur, lucasagomes, Shrews ;)15:06
NobodyCam:)15:07
dtantsurrloo, hi15:07
openstackgerritJim Rollenhagen proposed a change to openstack/ironic-python-agent: Use poll instead of threading.Event in heartbeat thread  https://review.openstack.org/11334315:09
jroll^ this was a fun one15:10
openstackgerritJim Rollenhagen proposed a change to openstack/ironic-python-agent: Use poll instead of threading.Event in heartbeat thread  https://review.openstack.org/11334315:11
lucasagomesrloo, morning!15:13
*** viktors is now known as viktors|afk15:13
*** yuanying has quit IRC15:15
*** mitz_ has quit IRC15:18
*** Poornima has joined #openstack-ironic15:18
NobodyCamanyone happen to know where I could find the ipmi protocol specification15:19
Shrewsjroll: you know that poll() call is blocking, right? unless you use eventlet to monkeypatch select lib somewhere15:19
lucasagomesNobodyCam, yup, 1 sec15:19
jrollShrews: yeah, it's fine, that's a separate thread15:20
lucasagomeshttp://www.intel.com/content/www/us/en/servers/ipmi/ipmi-specifications.html15:20
lucasagomesNobodyCam, ^15:20
jroll(I think that's fine)15:20
NobodyCamalso anyone have a ipmi system that htey could past the output of: ipmitool -H 127.0.0.1 —vvvv —I lan chassis power off15:20
jrollShrews: imbw, is that going to block from context switching?15:20
jrollNobodyCam: sure15:20
NobodyCamlucasagomes: Thank you :)15:21
Shrewsjroll: not thread context switching. just blocks your one thread15:22
jrollShrews: yeah, that's what I thought, that's fine15:22
jrollNobodyCam: though we've always used lanplus, not sure what -I lan does or if it will work15:23
* jroll tries15:23
Shrewsjroll: cool. i don't know that code well, so just mentioning it15:23
Shrewss/well/at all/15:23
*** mitz_ has joined #openstack-ironic15:23
*** Poornima has quit IRC15:25
*** dhellmann_ has quit IRC15:26
openstackgerritSam Betts proposed a change to openstack/ironic: Add a call to management.validate(task)  https://review.openstack.org/12538715:26
lucasagomesShrews, are u taking a look at 124493 comments ? if not I will put a patch up to address them15:27
Shrewslucasagomes: no, but i was thinking about dtantsur's comment about DBDuplicateEntry15:28
*** dhellmann_ has joined #openstack-ironic15:28
lucasagomesShrews, yeah :/ hmm did you find something? is that possible?15:28
Shrewslucasagomes: i'm not sure yet15:29
lucasagomesI haven't looked deep into it15:29
lucasagomesok15:29
jrollNobodyCam: PM'd15:29
jrollERROR (BadRequest): Multiple possible networks found, use a Network ID to be more specific. (HTTP 400) (Request-ID: req-d9987c01-0dcd-4a8f-89d0-a8557587c812)15:31
jrolljust got this in our normal devstack setup15:31
Shrewslucasagomes: i did run the patch through devstack and it seems to work15:32
jrollis that something new or did I break it?15:32
Shrewsjroll: new-ish15:32
jrollugh15:32
Shrewsuse --nic net-id=private, iirc15:32
Shrewsto nova boot15:32
jrollthanks15:32
jrollugh, has to be uuid15:33
*** dtantsur is now known as dtantsur|afk15:33
Shrewsoh, yeah  :(15:33
jrollnova network-list15:33
Shrewsjroll: NET=$(nova network-list | grep "private" | awk '{print $2}')15:33
jrollnice15:33
* jroll adds to docs15:33
jrollthanks15:33
*** dhellmann has quit IRC15:34
*** dhellmann_ is now known as dhellmann15:34
*** jasondotstar has joined #openstack-ironic15:34
lucasagomesShrews, w00t!15:34
lucasagomesdid you or deva tested the neutron bit yesterday?15:35
NobodyCamjroll: does inaplus work?15:35
Shrewslucasagomes: well, by work, i mean "conductor_affinity" gets set properly15:35
jrollNobodyCam: lanplus works for me, I have you output with just lan15:35
lucasagomesheh yeah15:35
NobodyCamjroll: could you paste both?15:35
jrollNobodyCam: I just closed everything :/15:35
Shrewslucasagomes: neutron bit?15:35
jrollNobodyCam: give me a few15:35
NobodyCamjroll: no need to re open15:36
lucasagomesShrews, yeah, on the take_over()15:36
NobodyCamthis is good15:36
lucasagomesI could test it because I was doing my tests in only 1 vm15:36
jrollNobodyCam: I can if it helps15:36
jrollok15:36
Shrewslucasagomes: oh, not yet (at least i haven't)15:36
lucasagomesso I got as far as checking that the images are cached and the configs are generated correctly... I see neutron being called but15:37
lucasagomesto test it we need another vm, 1 conductor per vm15:37
lucasagomes> 1 ir-cond in the same host is not supported15:37
Shrewslucasagomes: ah, hrm15:37
lucasagomesShrews, right, no bothers15:37
* Shrews wants our test environment setup with docker containers to test such things15:38
lucasagomesShrews, +1!!!!15:38
lucasagomesthat would be awesome15:39
Shrewsdox could eventually give us that... in time15:39
lucasagomesyush!15:39
*** igordcard has quit IRC15:42
*** mikedillion has joined #openstack-ironic15:42
*** andreykurilin_ has joined #openstack-ironic15:46
*** todd_dsm has joined #openstack-ironic15:49
openstackgerritSam Betts proposed a change to openstack/ironic: Add a call to management.validate(task)  https://review.openstack.org/12538715:56
lucasagomesShrews, once you unprovisioned ur node, did the conductor_affinity got unset?15:56
*** chuckC_ has quit IRC16:03
*** ifarkas has quit IRC16:04
*** k4n0 has quit IRC16:05
*** yuanying has joined #openstack-ironic16:12
NobodyCamgrrrrr....brb :-p16:14
*** dprince_ has joined #openstack-ironic16:15
*** annegentle_ has quit IRC16:17
*** dprince has quit IRC16:17
*** mordred has quit IRC16:17
*** harlowja_away has quit IRC16:17
*** annegentle has joined #openstack-ironic16:17
*** openstackgerrit has quit IRC16:18
*** annegentle is now known as Guest8478116:18
rloohi lucasagomes, wrt conductor_affinity (124493).16:18
*** mordred has joined #openstack-ironic16:18
rloolucasagomes: that value is meant to stick even after a deploy is done.16:18
rloolucasagomes: which is why it is never reset to None.16:18
lucasagomesrloo, so in the last review that deva put it up seems so16:18
lucasagomesbut I don't think it should16:19
lucasagomesI think it should be removed once the node is undeployed16:19
rloolucasagomes: yeah, my fault maybe. See his comments in patch 12.16:19
lucasagomesI left some comments there16:19
lucasagomeswill see16:19
*** openstackgerrit has joined #openstack-ironic16:19
rloolucasagomes: https://review.openstack.org/#/c/124493/12/ironic/conductor/manager.py16:20
NobodyCammore coffee needed16:21
lucasagomesrloo, right, yeah hmm16:22
lucasagomeswell I see it's not needed, but I think that leaving it in the db is a bit misleading16:22
rloolucasagomes: yeah. hmmm :-) I now wonder why we don't just set that when a node is registered/mapped to a conductor, regardless of whether a deploy was done or not.16:22
lucasagomescause that will only get updated if the node *is deployed* when running the periodic task16:22
rloolucasagomes: well, it isn't misleading. it depends on what you think the definition of that field is.16:22
lucasagomesrloo, well that the last conductor that have managed that node16:23
rloolucasagomes: if the def is 'the last conductor to ... ', then it is still true even after deployment. the last conductor that did something hasn't changed.16:23
lucasagomesyeah16:23
lucasagomesI think that's the definition, but still hmmm16:23
* lucasagomes thinking16:23
lucasagomesI see ppl trying to use it as "get all the nodes managed by this conductor"16:24
lucasagomesand using that field16:24
lucasagomesbut as the node is not deployed that field might be old16:24
devanandamorning, all16:24
lucasagomesidk I've mixing feelings about it, I got think more16:24
lucasagomesdevananda, morning16:24
rloomorning devananda. you're on the hot seat with your conductor_affinity ;)16:24
devanandalucasagomes: right now, that field could indicate an active deploy OR an active console session OR both16:25
openstackgerritLucas Alvares Gomes proposed a change to openstack/ironic: Add periodic task to rebuild conductor local state  https://review.openstack.org/12461016:25
devanandalucasagomes: in this case, people shouldn't use it --it's not exposed in the API,a nd I dont think it should be16:25
rlooor non-active.16:25
NobodyCamgood morning devananda :)16:25
lucasagomesdevananda, right, the console bit is another thing... cause it doesn't work in the take_over()16:25
lucasagomesI think that for this realease we may want to constraint the use of that field16:26
lucasagomesonly for the deploy/undeploy bit16:26
lucasagomesand once we fix the console we can set the affinity there as part of that patch16:26
devanandasure16:26
devanandathat's fine, too16:26
lucasagomesleaving a TODO/FIXME on the console part is ok16:26
devanandabut we also still don't need to unset it :)16:26
lucasagomesbut having partial fixes :/16:26
lucasagomesyeah we don't, I have to think more about it16:27
lucasagomesmy fear is ppl misusing it to find the nodes that controler X is managing and using that field16:27
lucasagomesbut I might be being a bit paranoid here16:27
JayFI would absolutely misuse it in that way16:27
lucasagomes^16:28
*** marcoemorais has joined #openstack-ironic16:28
lucasagomesyeah so, by setting when deployed and unset when undeployed makes it simple16:28
JayFOut of date data is slightly better than no data at all16:28
lucasagomesat least for this release16:28
lucasagomesJayF, +116:28
JayFlucasagomes: for IPA, conductors are needed even for inactive nodes16:28
lucasagomeswell...16:28
JayFlucasagomes: to handle heartbeats from running agents16:28
devanandalucasagomes: so a TODO or NOTE to change the default interval seems poor. standard is 60s. I think we leave it there, let people change it if needed.16:28
devanandaJayF: so you would update that field even for heartbeats?16:29
Shrewslucasagomes: 'nova delete' leaves conductor_affinity set16:29
Shrewsnot sure if that's correct?16:29
lucasagomesShrews, yeah we are talking bout it now16:29
devanandaShrews: that was my intent in rev8, yes16:29
devanandaas I see it, there's never a reason to unset it16:29
* Shrews reads sb16:30
JayFdevananda: I don't think we'd update it, just noting that whereas for pxe driver, the undeployed server is essentially just a paperweight, but with IPA+long running agents, it's not neccessarily16:30
JayFthat being said, long running agents isn't in J16:30
JayFso have fun16:30
JayFlol16:30
devanandabecause there is never a time (except at the very start of the cluster ,or for new ndoes, when it's already NULL) that there was not previously a conductor to which this node was mapped16:30
devanandaand the field indicates THAT16:30
lucasagomesdevananda, leave as 60s default? yeah, well that's fine16:30
devananda"some time in the past, this conductor touched this node"16:30
devanandait's not intended to LIMIT the interaction to that conductor16:31
devanandaand it's never checked / enforced16:31
lucasagomesright16:31
devanandaif another conductor tries to manage that node (and gets the TaskManager lock) then great -- that's fine. it should update the conductor_affinity field when it's done16:31
JayFSo for pxe driver, that would typically indicate, for instance, the conductor that pxe booted the instance?16:31
lucasagomes+1 yeah, maybe checking on the node lock if the affinity still correct16:31
devanandaJayF: that last pxe booted it. even if the node is now off and has no instance. yes.16:32
JayFdevananda: that makes sense, thanks :)16:32
devanandalucasagomes: no need to check it at all ever16:32
devanandathat's my point16:32
openstackgerritSam Betts proposed a change to openstack/ironic: Add a call to management.validate(task)  https://review.openstack.org/12538716:32
devanandaoutside of "the ring changed, do I need to take_over()"16:32
devanandathere is no need to read taht field16:32
lucasagomesyeah... I see it, I was most wondering on leaving that old data in the db16:33
lucasagomesand ppl building things on top of it... but yeah16:33
lucasagomeswell we can leave it then16:33
devanandabut it's not old data -- it has actual meaning16:33
devanandai'm sorry that I haven't communicated all this more clearly, and that it's left to the last minute like this.16:34
lucasagomesright, yeah "the last conductor that touched this node"16:34
lucasagomesdevananda, it's grand... as I said I had to think more about it16:34
lucasagomesso talking about it helps with that16:34
*** romcheg has joined #openstack-ironic16:34
* devananda needs to blog on how take over works. /me just need to blog.16:35
lucasagomeslol16:35
lucasagomesg+?16:35
* NobodyCam would read that blog16:35
rloowhy a blog? why not part of our doc?16:36
devanandait *also* should become doc'd16:37
NobodyCamanyone seen Jbjohnson on of late?16:37
lucasagomesNobodyCam, nop16:38
NobodyCam:-p16:38
devanandalucasagomes: so what's the next step. it looks like you've added a few revs to this -- should I take a look, or are you posting another?16:38
devanandaI really really want to land *something* now.16:38
lucasagomesdevananda, I added tests to it16:38
devanandaso we can cut RC116:38
lucasagomes+116:39
devanandaand unblock everything else :)16:39
lucasagomesdevananda, the neutron bit still unsure tho, I didn't test it yet :(16:39
devanandaI also think the reviews and questions have been super helpful16:39
devanandalucasagomes: ah, yea, i didn't get time yesterday16:39
devanandalucasagomes: i'll be in meetings again in an hour. might have time this afternoon... i hope16:40
lucasagomesdevananda, ack, no problem16:40
*** todd_dsm has quit IRC16:40
rloolucasagomes: does the periodic task stuff work with this new interpretation of conductor_affinity? (Am asking before I review.)16:41
lucasagomesdevananda, so the reviews I added has tests, and one behavior change if u set that internval to 0 the periodic task will be non-op16:41
lucasagomesin case ppl want to disable it16:41
devanandaoh, good. thanks!16:41
lucasagomesrloo, yup, the perirodic task filters for nodes associated with an instance16:41
devanandabeing able to disable it ++16:41
lucasagomesrloo, and check the affinity, if it's not set to the conductor that is suppose to manage it it just skip that node16:42
lucasagomesif it's the right conductor it updates that field16:42
lucasagomesrloo, so I think it's all fine16:42
rloolucasagomes: thx. will look after I grab some food!16:43
devanandaShrews: aren't the FK index names auto-generated?16:43
lucasagomesrloo, ack! thanks16:43
devanandaShrews: I just realized it is "iso_fk" which seems ... not auto generated16:43
*** romcheg has quit IRC16:44
*** yuanying has quit IRC16:45
*** romcheg has joined #openstack-ironic16:46
NobodyCamshould we fix/land https://review.openstack.org/#/c/118614 before we cut rc?16:49
*** derekh has quit IRC16:49
*** romcheg has quit IRC16:49
devanandalucasagomes: I'm fixing up 124493, unless you're actively working on it, then I'll wait16:50
lucasagomesdevananda, no, I started but didn't progress much16:50
lucasagomesI will have to go soon16:51
lucasagomesdevananda, so, go ahead16:51
devanandak k16:51
lucasagomesaight I think I should go (still need to finish packing his stuff to move out)16:52
lucasagomeshave a good night everyone16:52
*** lucasagomes is now known as lucas-afk16:52
*** andreykurilin_ has quit IRC16:53
NobodyCamhave a good noght lucas-afk16:54
*** jcoufal has quit IRC16:54
NobodyCamnight even16:54
Shrewsdevananda: they have to be named so that we can manually drop the FK before dropping the column in the migration script16:55
devanandaShrews: sure. i mean, i thought the names were generated systematicaly16:56
devanandathere was some work in oslo to do that a year or two ago16:56
openstackgerritGhe Rivero proposed a change to openstack/ironic: Update hacking version in test requirements  https://review.openstack.org/12542316:56
Shrewsdevananda: but what is the name we'd use when dropping the FK? is it guessable?16:56
devanandaShrews: i'm not saying this clearly :(16:57
Shrewsdevananda: please use new words  :-P16:57
devanandaShrews: eg, http://lists.openstack.org/pipermail/openstack-dev/2013-December/022647.html16:58
devanandahttps://review.openstack.org/#/c/84444/16:58
devananda"will have a name such as <tablename>_<columnname>_fk"16:58
*** vdrok_ has joined #openstack-ironic16:59
*** vdrok__ has joined #openstack-ironic16:59
Shrewsdevananda: from what i recall, it was "nodes_ibfk_2" or similar17:00
*** vdrok__ has quit IRC17:00
devanandathat's ... not very helpful17:00
Shrewsthere is an existing nodes_ibfk_1 FK17:00
devanandaright. auto generated names ftl17:01
devananda"ibfk_1" is terrible17:01
devanandanodes_conductor_affinity_fk is better17:01
* devananda renames17:01
rlooNobodyCam: wrt 118614, that's docn, it can be done after RC1 is cut. It'll show up whenever it is merged.17:02
NobodyCam:) rloo ack17:02
Shrewsdevananda: did you read note #3 on that email link? "Foreign keys shouldn't be created against nullable columns."  We are doing that.17:05
devanandaon or against?17:06
Shrewsah, against here. nm17:06
devanandawe could make online non-NULLable17:07
devanandadefault True17:07
devanandaI think that preserves the same behavior for new clusters. it might be better when upgrading an existing cluster17:07
Shrewsthat sounds reasonable for upgrade situations17:08
*** rushiagr_away is now known as rushiagr17:10
*** jistr has quit IRC17:13
*** harlowja has joined #openstack-ironic17:15
*** killer_prince has quit IRC17:23
*** lsmola has quit IRC17:31
devanandathis is an odd error: InvalidRequestError: Object '<Conductor at 0x7f43aab5c650>' is already attached to session '341' (this is '342')17:35
devanandahappens every time, if I select, then try to update that result, without an explicit session object17:36
devanandadtantsur|afk: ^ that's why we (currently) need a session17:36
devanandaI suspect it's a bug, but am not going to dig into it further now17:36
*** killer_prince has joined #openstack-ironic17:36
*** killer_prince is now known as lazy_prince17:37
*** yuanying has joined #openstack-ironic17:42
*** vdrok_ has quit IRC17:45
*** vdrok_ has joined #openstack-ironic17:46
*** eghobo has joined #openstack-ironic17:46
*** vdrok_ has quit IRC17:46
*** dlaube has joined #openstack-ironic18:01
*** tatyana has joined #openstack-ironic18:05
*** ChuckC has quit IRC18:05
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add "affinity" tracking to nodes and conductors  https://review.openstack.org/12449318:05
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add "affinity" tracking to nodes and conductors  https://review.openstack.org/12449318:09
*** tatyana has quit IRC18:10
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add periodic task to rebuild conductor local state  https://review.openstack.org/12461018:11
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Make hash ring mapping be more consistent  https://review.openstack.org/11893218:13
devanandarloo, lucas-afk: fixed up 124493 based on comments and our discussion. then rebased the others on top18:13
devanandaafk for meetings and lunch...18:13
rloothx devananda18:13
*** yuanying has quit IRC18:16
*** rushiagr is now known as rushiagr_away18:19
*** ChuckC has joined #openstack-ironic18:25
*** pelix has quit IRC18:25
*** ChuckC has quit IRC18:25
*** ChuckC has joined #openstack-ironic18:25
*** chuckC_ has joined #openstack-ironic18:27
*** todd_dsm has joined #openstack-ironic18:27
*** Ng has quit IRC18:40
*** Ng has joined #openstack-ironic18:40
NobodyCambrb18:59
openstackgerritJim Rollenhagen proposed a change to openstack/ironic-python-agent: Use poll instead of threading.Event in heartbeat thread  https://review.openstack.org/11334319:02
openstackgerritJim Rollenhagen proposed a change to openstack/ironic-python-agent: Force heartbeat immediately after async command completes  https://review.openstack.org/12546419:02
*** yuanying has joined #openstack-ironic19:12
NobodyCambrb19:39
*** yuanying has quit IRC19:46
ShrewsI'm confused by https://review.openstack.org/125423. Does 'hacking' not get updated automatically?20:14
Shrewsgit log says it was updated automatically like that last Nov. did it change to manual at some point?20:16
*** igordcard has joined #openstack-ironic20:27
*** romcheg has joined #openstack-ironic20:28
Shrewsfyi, hacking is a special case, as confirmed by infra20:34
* Shrews gets lonely20:36
NobodyCam:(20:36
NobodyCamso do we need to update hacking20:36
NobodyCamor will infra?20:37
ShrewsNobodyCam: we should. i left a comment explaining why20:37
* NobodyCam tries to cheer up Shrews 20:37
*** rainya has joined #openstack-ironic20:37
NobodyCam:)20:37
*** romcheg has quit IRC20:38
jrollwonder how badly IPA fails with later hacking20:38
*** romcheg has joined #openstack-ironic20:38
*** romcheg has quit IRC20:38
*** yuanying has joined #openstack-ironic20:42
*** mikedillion has quit IRC20:51
*** todd_dsm has quit IRC20:55
*** marcoemorais has quit IRC21:02
*** andreykurilin_ has joined #openstack-ironic21:02
*** jasondotstar has quit IRC21:02
*** marcoemorais has joined #openstack-ironic21:02
*** marcoemorais has quit IRC21:03
NobodyCambrb21:03
*** marcoemorais has joined #openstack-ironic21:05
*** marcoemorais has quit IRC21:05
*** marcoemorais has joined #openstack-ironic21:05
*** marcoemorais has quit IRC21:06
*** marcoemorais has joined #openstack-ironic21:06
*** pensu has quit IRC21:11
* devananda returns from meetingland21:14
devanandaanyone around to do reviews on the RC blocking patches?21:14
*** yuanying has quit IRC21:16
Shrewsyup21:17
devanandagreat21:17
Shrewsdevananda: i left a +1 on the first in the series (affinity). looking at the period task one now21:19
* devananda looks at the first patch, sees a +1 and a -121:19
Shrewsthe 2nd has all sorts of test failures.21:20
devanandaurgh21:20
devanandak. passed everything locally21:21
Shrewssome don't look related21:21
devanandarloo: re: your comments on https://review.openstack.org/#/c/124493/12..16/ironic/db/sqlalchemy/api.py - can you explain why a heartbeat shouldn't set online=True?21:21
rloodevananda: can you explain why it should? shouldn't it already be set properly?21:22
devanandarloo: it should indeed. but if it's not, should that conductor's heartbeats not be counted?21:22
rloodevananda: no, I don't think so. if it isn't, there's a bug or I'm not understanding something.21:23
devanandait would be a bug, yep21:23
rloodevananda: so I'd rather know about the bug and fix it, than to 'cover' it here.21:23
devanandaand the result would be the conductor service is running, but gets removed from the hash ring21:23
devanandaand doesn't log anything21:23
devanandaactually - i can revert that part of the change21:24
rloodevananda: but the conductor service was offline?21:24
devanandaand it will raise ConductorNotFound, whcih should at least get logged from the periodic task21:24
devanandaI prefer a service to be self-correcting, rather than require an operator to notice the log entry21:24
devanandaand I don't see any harm in it21:25
devanandaif the conductor service is actually running and heartbeating, but somehow the DB is out of sync, why should it require an operator to come "fix" it?21:25
devanandawhy not just self-correct?21:25
rloodevananda: guess I'm worried about either a race condition or buggy code. So I'd rather not have some self-heal for something I'm not aware of.21:25
Shrewsa network blip might hide one conductor. split brain type thing. we'd want it to come back when the network is fixed21:25
jrollself-healing++21:26
devanandarloo: I would rather have it self-heal when it can.21:26
rloobut *when* does a conductor's online get set to False?21:26
devanandarloo: when it calls unregister21:26
rlooonly with some change in the db directly or when unregister is called.21:26
devanandawhat if I'm trying to restart the service and somehow those messages get reversed21:26
devanandaand now there's a running service but the DB record says "online=False"21:27
devanandarloo: my point isn't "recover from known failure" -- it is that this is a trivial way a service can recover from unknown failurse21:27
devanandanow that the heartbeat is mroe than just the updated_at field, we should set both fields21:28
rloodevananda: Ok, if you see it that way. We should probably log a message in that case? if it was False?)21:28
Shrewsdevananda: actually, my comment makes me think... what if we do have a split-brain situation? wouldn't both conductors attempt to take over the other's nodes?21:29
devanandaif we actually have two conductor services with the same *hostname*, all bets are off21:29
rloodevananda: what if an unregister happens, but while unregistering, the touch_conductor() gets called?21:30
*** andreykurilin_ has quit IRC21:30
devanandarloo: then, instead of that conductor appearing offline immediately, it will continue to appear to be online until the timeout is reached21:31
devanandarloo: and after that, it'll still appear offline21:31
devanandaso the system will take a bit longer to converge on the correct state, but that's fine21:31
devanandasee get_active_driver_dict -- it filters by both online=True and updated_at21:32
devanandaShrews: we're relying on the operator configuring each service with a unique hostname21:33
rloodevananda: sorry, i was thinking what if the dbapi.unregister_conductor gets called first (which sets online=False), then the touch_conductor() gets called which sets online=True. But maybe that will never happen.21:33
devanandaShrews: so a split brain here could happen eg. if using galera, and the heartbeat updates stop propagating across galera nodes21:33
devanandaShrews: so (this pool) and (that pool) of conductors each try to map all nodes onto their own half of the ring21:34
rloodevananda: anyway. If you feel strongly about adding online=True in touch_conductor (and jroll seems to like it too), I'd like to at least log if it was false->true. Since we're talking about something unexpected happening.21:35
devanandaShrews: that'd suck. let's not do that. also -- if the ironic DB gets that busy, it means someone is runing a *massive* cluster.21:35
devanandarloo: so, logging that will require issuing a SELECT query first. there isn't one right now.21:36
jrollrloo: I tend to like self-healing, I'm not very aware of the details for this right now21:36
openstackgerritJosh Gachnang proposed a change to openstack/ironic-python-agent: Add command metrics to IPA API  https://review.openstack.org/11998121:36
rloodevananda: sigh. ok then. I don't think we want to issue a SELECT each time.21:36
devanandajroll: short version: heartbeat used to only set updated_at=now(). I'm adding another "online" field so we can track intentional shutdown vs. it-just-went-away21:36
devanandajroll: and I'm suggesting that the heartbeat just default to set "online=True" -- not that we expect it to have another value, but just in case21:37
jrolldevananda: yeah, I think that's fine. we see the conductor is alive, mark it alive21:38
jrollright?21:38
devanandajroll: this is called only from the conductor's heartbeat periodic task21:38
jrolldevananda: right, in that case we know it's alive21:38
devanandarloo: the code won't directly call unregister and then touch_conductor -- but if we were to move to a properly threaded db connector, then in theory, it might be possible21:39
devanandarloo: but I don't think that would actually result in a problem -- the cluster will still eventually notice that the conductor is offline21:40
devanandait's eventual state would be  (online: True, updated_at: really-old-timestamp)21:40
devanandajroll: right21:40
rlooif we're ok with that eventual state, then that's fine.21:42
openstackgerritA change was merged to openstack/ironic: Update hacking version in test requirements  https://review.openstack.org/12542321:42
*** shikui_ has joined #openstack-ironic21:43
NobodyCamdevananda would we ever want to get a conductor that is offline? question comes from line 529 of https://review.openstack.org/#/c/124493/16/ironic/db/sqlalchemy/api.py21:44
devanandaNobodyCam: you mean get a list of conductor services that are offline?21:45
NobodyCamgah /me luvs it when his irc forgets to scroll the screen21:45
NobodyCamdevananda: yes21:46
devanandaNobodyCam: I can't predict the future :)21:46
Shrewslousy PTL!!!! :-P21:48
* Shrews hands devananda a Magic 8-Ball21:48
devanandaShrews: THe 8-ball says: Something will definitely happen soon.21:49
Shrewshttp://www.ask8ball.net/21:49
devanandarloo: so, instead of filtering on associated (since your'e correct, that's technically orthogonal to provision state, even though Nova sets it first)21:51
NobodyCamdevananda: are you opposed to something like : http://paste.openstack.org/show/gwLgksRa7oGR7tH26hRP/21:51
rloodevananda: wrt 124493, i'm ready to +2 if you can tell me what the plan is wrt unit tests.21:51
devanandarloo: you'd just use prov_state = ACTIVE ?21:51
devanandarloo: the plan is: I write them now :)21:52
rloodevananda: well, that's what is being checked after getting the lock21:52
rloodevananda: ok, so I'll wait for the tests then ;)21:52
devanandayou have some good points on 124610 that need to be addressed21:53
*** mordred has quit IRC21:54
*** mordred has joined #openstack-ironic21:54
devanandarloo: actually, question for you abotu the tests21:55
rloodevananda: "I don't know" :-)21:55
devanandahehe21:56
devanandarloo: I think the only API change is adding "update_existing" parameter21:57
devanandarloo: are there other unit test changes you want to see?21:57
rloodevananda: I was right, I don't know. I didn't actually think about them. Just the lack of any seemed odd ;) Let me put on my thinking cap...21:58
*** marcoemorais has quit IRC22:05
*** marcoemorais has joined #openstack-ironic22:06
*** marcoemorais has quit IRC22:06
*** marcoemorais has joined #openstack-ironic22:07
*** marcoemorais has quit IRC22:07
*** marcoemorais has joined #openstack-ironic22:07
*** mikedillion has joined #openstack-ironic22:10
*** yuanying has joined #openstack-ironic22:12
*** marcoemorais has quit IRC22:13
*** marcoemorais has joined #openstack-ironic22:13
rloodevananda: is there or do we need to add a test for the new db changes? (in tests/db/sqlalchemy/test_migrations.py22:15
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add "affinity" tracking to nodes and conductors  https://review.openstack.org/12449322:15
devanandarloo: ^22:15
devanandarloo: no. all migrations should be automatically tested22:16
rloodevananda: I looked quickly and I think you got them covered. I just added a comment (to the previous revision) with a few minor suggestions wrt tests. I have to go now but will review it 1-2 hrs from now I think.22:20
*** rloo is now known as rloo_afk22:20
*** marcoemorais has quit IRC22:20
devanandarloo_afk: thanks much22:20
NobodyCambrb22:23
*** marcoemorais has joined #openstack-ironic22:31
NobodyCamdevananda: line 233 of https://review.openstack.org/#/c/124493/17/ironic/conductor/manager.py should we log something about unable to register conductor?22:32
*** marcoemorais has quit IRC22:33
devanandaNobodyCam: it's already "shut down" so why log it again?22:33
devanandaalso, we shouldn't be adding more translatable strings now22:34
devanandaalso, changing whether it logs there is unrelated to the hash ring rebalancing22:34
NobodyCamahh oh kay :)22:36
jrolldevananda: you remember off hand the difference between nova reschedules / does not reschedule?22:40
jrollI remember it was a certain type of exception22:40
jrollunrelated, as much as I appreciate a string freeze, avoiding adding logging due to that sounds like a bad decision, if it's good loggin22:41
jrollg22:41
jroll("why log it again" is a fine reason)22:42
devanandajroll: I don't disagree22:42
jrollok :)22:42
*** tatyana has joined #openstack-ironic22:42
devanandajroll: but also, i feel frustrated in that we haven't fixed this yet22:42
*** tatyana has quit IRC22:42
jrolldevananda: ha, indeed22:42
jrolldevananda: can I help? just need code reviews?22:43
* jroll blindly +A's everything22:43
devanandaand I"m getign frustrated with nit picks :)22:43
devanandajroll: yes. reviews are great22:43
devanandajroll: and if you can test it with two conductors on different IPs to amke sure it's working with Neutron properly, that'd be grand22:44
devanandai haven't had time, and neither has lucas22:44
devanandaand that's the last functional piece we haven't tested22:44
*** Guest84781 is now known as annegentle22:44
jrollif I had a pxe driver environment I'd love to :(22:44
*** yuanying has quit IRC22:45
devananda:(22:47
*** yuanying has joined #openstack-ironic22:48
jrolldevananda: I wonder if this would work with a devstack + a conductor on a separate VM22:50
jrollI guess it could22:51
jrolllet me see what I can do22:51
NobodyCam124493 reviewed22:51
devanandajroll: probably could. cheers22:53
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Add periodic task to rebuild conductor local state  https://review.openstack.org/12461022:53
devanandaNobodyCam: tyvm22:54
NobodyCamlol I was just lookst at ^^^ did you happen to add a doc string to def _get_ksclient():22:55
NobodyCamits an internal function and the name is clear enough ...22:55
NobodyCambut every other function has at least a one liner22:56
jrollidk what a ksclient is :P22:56
NobodyCam*every other function in that file22:56
NobodyCam:-p22:56
openstackgerritDevananda van der Veen proposed a change to openstack/ironic: Make hash ring mapping be more consistent  https://review.openstack.org/11893222:57
jrolldevananda: mind rebasing 118932 real oh there you go :P22:57
devanandaNobodyCam: lol22:58
NobodyCamis line 877 of https://review.openstack.org/#/c/124610/13/ironic/conductor/manager.py missing a _22:59
*** lucas-afk has quit IRC23:02
devanandaNobodyCam: nope23:04
devanandait's *not* translated23:04
devanandaI can replace the , with %23:05
jrolloh lord, I need to mysql/rabbit across the internet :|23:05
jroll(to do this easily)23:05
devanandajroll: urg. yup23:05
NobodyCamfor me it was the () that wraped the string that thru me23:05
devanandaah23:06
Shrewssorry, had to do the dinner thing. around now.23:08
devanandadhellmann: going through your email/'pad now23:09
Shrewsdevananda: although i'm listed as co-author on 124493, i'm willing to +2A that thing at this point if no one else is reviewing it23:09
NobodyCamWB Shrews23:09
*** mikedillion has quit IRC23:09
NobodyCamShrews: rloo said she would look at it in a couple of hours23:10
ShrewsNobodyCam: ack23:10
devanandaShrews: looks like rloo wanted unit tests, but otherwise +'d it23:11
devanandaso ya, probably good to give her a chance to review the unit tests I added23:11
Shrews124610 is looking good to me. not sure how to test that, though23:30
* Shrews wants to add a +1.5 vote23:30
Shrewsbrb23:31
rloo_afkI'm back but I would NOT be offended if you +2 and +A'd something in my absence!23:37
*** rloo_afk is now known as rloo23:37
*** dprince_ has quit IRC23:39
*** romcheg has joined #openstack-ironic23:40
NobodyCamdevananda: correct me if I am wrong, if a node in is a state like deploying and the conductor handling the deploy dies, then affinity will not get updated untill the deploy actually times out and the p-task runs after the time out23:40
*** romcheg has quit IRC23:48
*** mikedillion has joined #openstack-ironic23:55

Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!