Monday, 2015-11-30

openstackgerritOleksii Zamiatin proposed openstack/oslo.messaging: WIP: [zmq] PUB-SUB pipeline.
lxslidhellmann: we haven't released mutation_hook yet16:15
lxslidhellmann: should I push a patch to remove it, to save API-breakage later? in case oslo.config gets auto-released soon16:15
lxslidhellmann: no idea about taskflow or that multi-callback system you mentioned. I'd rather avoid getting diverted without at least a lot more details (dims?)16:16
openstackgerritChangBo Guo(gcb) proposed openstack/oslo.versionedobjects: Use version convert methods from oslo_utils.versionutils
dhellmannlxsli : I think we just talked about your questions in the meeting, but for the record here in case anyone is looking at the logs later: I thought we agreed at the summit that oslo.config didn't need callbacks, that the service-level code receiving the instruction to reload the config would tell oslo.config to do that, and then when that was successful it would invoke any other functions needed to update the application or library code that16:53
dhellmann needed to know about new option values.16:53
lxslidhellmann: yep thanks16:53
lxslidhellmann: I need a bit of help if possible - I was passing a 'fresh' dict of changed values to the hook. I can't see a good way to pass that out.16:54
lxsliI could return it from reload_config_files but you'll get a False-y value ({}) if the reload succeeded but nothing changed16:55
dhellmannlxsli : we should raise an exception if reload fails. I'm not sure why we need to know what config settings changed, though?16:55
lxslidhellmann: current behaviour is to return False if reload fails16:55
dhellmannlxsli : we might need a new method, then, if we definitely need to know what options have changed. Why do we need that?16:56
lxslidhellmann: I thought it'd be more convenient to have an updated dict than to rifle through the whole config; but that might not be true thinking about it16:56
dhellmannsome types of mutable options are going to require logic to deal with the updates (log levels changing). That logic should just be idempotent so it doesn't matter if it's called repeatedly with the same value.16:58
lxslilet's fix it when we find a case that needs it then16:58
dhellmannother types of mutable options don't need any logic at all, if they're a static value that is just read from the configopts instance when it's needed (I can't think of a good example of this)16:58
lxslithank you16:59
dhellmannalthough now we have a new failure case that needs to be logged, which is "a nonmutable option was changed during the reload"16:59
dhellmannoslo.config can do that itself, I suppose16:59
lxslidhellmann: I'm already logging a warning for that16:59
dhellmannI apologize for not keeping up with the development of this :-/17:00
lxslidhellmann: np you have plenty on your plate I'm sure!17:00
lxslidhellmann: oslo have been very responsive compared to, say, Nova :)17:01
dhellmannlxsli : we do try! and thank you for working with us. :-)17:01
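A minimal sketch of the idempotent update logic discussed above, assuming a plain dict stands in for the reloaded configuration. The names `apply_log_level`, `reload_and_apply` and `load_settings` are hypothetical and are not part of oslo.config or oslo.log. The point is only that the handler can be called after every reload, with or without a changed value, and that a failed reload surfaces as an exception rather than a False-y return.

```python
import logging


def apply_log_level(settings, logger):
    """Apply the 'debug' setting to a logger; safe to call repeatedly.

    Returns True if the level was changed, False if it already matched.
    """
    wanted = logging.DEBUG if settings.get("debug") else logging.INFO
    if logger.level != wanted:
        logger.setLevel(wanted)
        return True
    return False


def reload_and_apply(load_settings, logger):
    """Service-level reload: load first, then push values where needed.

    load_settings is a hypothetical callable that returns the new settings
    dict and raises an exception if the reload fails.
    """
    settings = load_settings()  # an exception here aborts the update
    apply_log_level(settings, logger)
    return settings
```

Because the handler compares against the current state, the caller does not need a dict of changed values; calling it with the same value twice is a no-op.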
openstackgerritRoman Podoliaka proposed openstack/oslo.db: Refactor deps to use extras and env markers
openstackgerritAlexis Lee proposed openstack/oslo.config: Remove mutation hook
* dhellmann wanders off in search of food17:02
lxslidhellmann: when you get back - ^^17:02
lxslidims: if you'd also be kind enough to look at that, it's important this merges before 3.0.1 is released17:03
dimslxsli thanks!17:04
dimslxsli : will wait for the changes to get through CI17:04
lxslidims: yes for sure, thank you :)17:05
lxslibtw I'm working on a patch to oslo.log to make 'debug' (and 'verbose') mutable17:06
lifelesshaypo: testr run --analyze-isolation17:13
haypolifeless: hi. what is --analyze-isolation?17:13
haypolifeless: i made some minor progress on my issue. tests fail with: python testr --slowest --testr-args='glance.tests.unit', but they don't fail anymore if i add --no-parallel17:14
lifelesshaypo: it will use the set of failures and the traces they occurred in to detect cross test interactions17:14
haypolifeless: i spent 5 hours to try to reproduce the failure. it's really amazing how random it is17:15
lifelesshaypo: you can re-inject the failure into the db with testr load17:15
lifelesshaypo: then try testr run --analyze-isolation, it may get a result quickly for you17:15
haypoi just retried 'rm -rf .testrepository/; date; tox -r -e py34; date' 4 times with --no-parallel, but it doesn't fail anymore17:16
haypowithout --no-parallel (so with parallel test runners), it only takes 2 runs to get a failure17:16
haypolifeless: before i reproduced. then i used testr last --subunit|subunit2pyunit. then i manually extracted the list of tests from it, i rerun tests in the same order with it, but it doesn't fail anymore!?17:18
haypo(before, i reproduced the failure*)17:18
haypoi also copied PYTHONHASHSEED, it doesn't help17:18
haypolifeless: it's unclear to me how testr combines results from parallel test runners17:18
lifelesshaypo: backends can run in any order17:19
lifelesshaypo: it's basically set based; the new scheduler will change that a bit but it's not ready yet17:19
haypolifeless: i'm not sure that running the same tests in the exact same order is enough to reproduce the bug17:19
lifelesshaypo: you're not doing that anyway, you're just loading them into the same backend17:20
lifelesshaypo: anyhow, once you've had a failure, run testr run --analyze-isolation17:20
lifelesshaypo: I know I sound like a broken record17:20
lifelesshaypo: but it exists precisely for the sort of scenario you're describing; see the docs for a detailed description17:21
haypolifeless: ok. since you mentioned --analyze-isolation i'm trying to reproduce the bug, but i'm unable to reproduce it anymore :-)17:22
lifelesshaypo: stop using --no-parallel :)17:22
lifelesshaypo: note that one of the differences between testr run and test is that defaults to parallel17:23
haypolifeless: it's exactly what i'm doing right now. 3 runs later, the bug doesn't want to appear again17:23
haypoas i said, it's really hard to reproduce it :-/17:23
lifelesshaypo: if you hadn't deleted your testr db17:24
lifelesshaypo: we could just have pulled a failed run out of it17:24
lifelesshaypo: suggestion for next time: don't do that17:24
haypomaybe. i'm really lost after 5 hours :-p17:24
haypolifeless: i'm trying random things and i'm getting random output17:25
haypoi became a random number generator17:25
haypolifeless: i'm using 'tox -r -e py34' instead of 'tox -e py34', it's insane, but right now, it's the most efficient way to reproduce the bug...17:26
lifelesshaypo: so do this17:26
haypolifeless: i noticed that sometimes tests behave differently depending on the .pyc timestamp17:26
lifelesshaypo: testr run --parallel --until-failure17:26
lifelesshaypo: and leave it for 30m17:27
haypolifeless: ah! i just reproduced the bug. now i'm running 'testr run --analyze-isolation'17:28
*** ihrachys has quit IRC17:28
haypothe full test suite takes than 3 minutes17:28
haypolifeless: FYI i'm working on a glance issue, to port unit tests to python 317:28
haypolifeless: ok, it was very helpful: "unknown - no conflicts" :-D17:28
lifelesshaypo: ok, so that means that it's truly a race17:29
haypolifeless: haha17:29
haypolifeless: i wasn't kidding you17:29
lifelesshaypo: now17:29
lifelesshaypo: in the failure trace each test has a tag worker-N17:29
lifelesshaypo: all the tests with the same worker-N ran on the same backend17:29
lifelesshaypo: you can use subunit-filter < .testrepository/$ID to filter for just the tests from one worker17:30
lifelesshaypo: and then subunit-ls to get their ids17:30
lifelesshaypo: and then testr run --no-parallel --until-failure --load-list $name to loop17:31
lifelesshaypo: how many tests fail when this happens ? one or many?17:31
lifelesshaypo: have a look at their worker ids, do they all fail in the same backend ?17:31
haypolifeless: i have to go, will be here in 1 or 2 hours?17:32
lifelesshaypo: btw, one trick - the files in .testrepository are still in subunit v1, so you need to pipe the run file | subunit-1to217:32
lifelesshaypo: I will17:32
haypolifeless: see you later17:33
lifelesshaypo: kk; we'll pin this down17:33
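A small sketch of the grouping step described above, assuming the failing run has already been decoded into `(test_id, tags)` pairs in execution order. The decoding itself (subunit-filter, subunit-ls, subunit-1to2) is not shown, and the function names and sample ids are hypothetical. It groups tests by their `worker-N` tag and writes one worker's tests to a file that could be passed to `testr run --no-parallel --until-failure --load-list`.

```python
from collections import defaultdict


def group_by_worker(results):
    """Group test ids by their worker-N tag, preserving run order.

    results: iterable of (test_id, tags) pairs, where tags is an
    iterable of strings such as "worker-0".
    """
    groups = defaultdict(list)
    for test_id, tags in results:
        for tag in tags:
            if tag.startswith("worker-"):
                groups[tag].append(test_id)
    return dict(groups)


def write_load_list(test_ids, path):
    """Write one test id per line, the format --load-list expects."""
    with open(path, "w") as handle:
        handle.write("\n".join(test_ids) + "\n")


# Hypothetical sample data standing in for a decoded failing run.
sample = [
    ("tests.unit.test_a", ["worker-0"]),
    ("tests.unit.test_b", ["worker-1"]),
    ("tests.unit.test_c", ["worker-0"]),
]
by_worker = group_by_worker(sample)
```

If the failing tests all carry the same worker tag, looping over just that worker's list is the cheapest next experiment; if they are spread across workers, a cross-process race is more likely.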
*** ozamiatin_ has joined #openstack-oslo17:56
openstackgerritMerged openstack/oslo-incubator: Cleaning up code in oslo-incubator
*** harlowja has joined #openstack-oslo18:10
*** edmondsw has quit IRC18:11
*** pm90___ has joined #openstack-oslo18:35
harlowjadhellmann moved that lock file stuff/question/review to
harlowjasince i'd really like to know/agree on why or what people are doing to clean those up...18:44
harlowjaconnected to
harlowja(if u forget)18:44
dhellmannharlowja : yeah, the glob thing felt pretty unpolished18:44
harlowjaya, i mean i get it18:44
harlowjabut it still seems umm odd18:44
harlowjai'd like to maybe just switch to a lock file per project, and let the project use offsets into it18:44
harlowjaso there would be a /var/locks/nova.lock and that's it18:45
harlowjathat file would never get deleted (ever)18:45
harlowjaand it would be of sufficient size to be useful to said nova (or other) project18:45
harlowjabut ya, check that thread out when u have time18:46
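One way the "single lock file per project, with offsets into it" idea could look, as a rough sketch only. This is not oslo.concurrency code; `ByteRangeLock`, `slot_for` and the `SLOTS` size are hypothetical. It uses POSIX byte-range locks through `fcntl.lockf`, so it is Unix-only, and each lock name is hashed to a one-byte slot in a file that is never deleted.

```python
import fcntl
import os
import zlib

SLOTS = 1024  # assumed number of one-byte lock slots per project file


def slot_for(name, slots=SLOTS):
    """Map a lock name to a stable byte offset inside the lock file."""
    return zlib.crc32(name.encode("utf-8")) % slots


class ByteRangeLock:
    """Exclusive lock on one byte of a shared, never-deleted lock file."""

    def __init__(self, path, name, slots=SLOTS):
        self.path = path
        self.offset = slot_for(name, slots)
        self.fd = None

    def __enter__(self):
        self.fd = os.open(self.path, os.O_RDWR | os.O_CREAT, 0o644)
        try:
            # Blocks until the single byte at self.offset is free.
            fcntl.lockf(self.fd, fcntl.LOCK_EX, 1, self.offset, os.SEEK_SET)
        except OSError:
            os.close(self.fd)
            self.fd = None
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        fcntl.lockf(self.fd, fcntl.LOCK_UN, 1, self.offset, os.SEEK_SET)
        os.close(self.fd)
        self.fd = None
        return False
```

Two names that hash to the same slot simply contend for the same byte, which is safe but adds false contention. POSIX locks are held per process, so this sketch only coordinates separate processes, not threads inside one process.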
dims#success cross-project, technical-debt-reduction effort pays dividends, no code left in oslo-incubator repo anymore19:18
openstackstatusdims: Added success to Success page19:18
harlowjadims don't lose my superawesome tool scripts, ha19:19
*** salv-orlando has joined #openstack-oslo19:20
harlowja*or other tools in
harlowjasome of those are still nice19:20
dimsharlowja : y just the sources we used to sync to projects19:20
openstackgerritDavanum Srinivas (dims) proposed openstack/oslo.messaging: Option group for notifications
dimsharlowja : seen this yet?
openstackLaunchpad bug 1520397 in debtcollector "Problem with abstract classes" [Undecided,New]19:32
harlowjahave not19:33
harlowjai'm gonna blame the wrapt library, ha19:33
harlowjadims why u break it, ha19:34
* harlowja wonders if wrapt decorator doesn't handle that case very well19:35
harlowja*universal decorator19:35
dimsy probably19:35
harlowjalooks similar :-P19:35
harlowja'metaclass conflict when deriving from decorated classes' ...19:36
harlowjamaybe try 'Derive from A.__wrapped__.'19:36
dimsthat would look odd (and break compat too)19:37
harlowjawell might just look odd, not sure about break compat19:39
harlowjaother option i think is to not have remove() for classes and have a specialized remove_class19:40
harlowjabecause remove, using wrapt universal decorator (which i trust Graham, the author, to do way better than i can) doesn't seem to handle this case well19:40
*** lucasagomes is now known as lucas-dinner19:42
harlowjadims sooo up to u :-P19:45
dimsharlowja : y let's leave it as a bug for now19:46
openstackgerritJoshua Harlow proposed openstack/oslo.concurrency: Add complementary remove lock with prefix function
dansmithrlrossit: tests? no way man.. tests are for ... yes, of course :)20:25
*** EinstCrazy has joined #openstack-oslo20:42
rlrossitdansmith: for right now I'm just going to write them around the ObjectVersionChecker, but if I get enough gusto I'll throw up other patches for testing the other fixtures20:46
*** EinstCrazy has quit IRC20:47
dougwighi oslo, I have an oslo_i18n question. Neutron has neutron/, which a lot of its subprojects have been importing and using. When creating neutron-lib, I noticed in your docs that it's not a good idea to export from a library, so made it, and swapped neutron around as well ( ). This implies that all the21:21
dougwigneutron subprojects (a bunch of vendor repos + lbaas, fwaas, and vpnaas) should also have their own, right? What's your recommendation there?21:21
*** vilobhmm11 has joined #openstack-oslo22:19
lifelesshaypo: ok so22:29
lifelessdo you have the id of a failing run ?22:29
lifelesslike 6 from your pastebin ?22:29
haypolifeless: sorry i just want to give up22:30
haypolifeless: it's just impossible to reproduce the bug22:30
lifelesshaypo: up to you; 5 minutes would get us something that could run in a loop while you sleep22:30
haypolifeless: it wouldn't help22:31
haypolifeless: i found many ways to trigger the bug22:31
lifelesshaypo: why not?22:31
lifelesshaypo: but22:31
haypobut it was never able to reproduce it twice22:31
haypoi was*22:31
lifelesshmm, where does lockutils-wrapper come from ?22:33
lifelesshaypo: so, what I was proposing to do was to run in a loop22:33
lifelesshaypo: hopefully eliminating (after 8 hours or so) single-thread interactions: or demonstrating that it can happen in a single thread22:33
haypolifeless: i don't understand how a loop would help22:33
lifelesseither way we'd gain data22:33
haypolifeless: i created between 2 and 10 list of tests which reproduced the bug22:34
lifelessif it's cross thread, we can look at timing data from the full run failure (id 6) to determine what tests ran at the same time22:34
haypowith the PYTHONHASHSEED22:34
haypobut i was never able to see the bug reappear from an existing list22:34
haypolifeless: i just give up, sorry22:35
lifelessif it's able to be reproduced in a single process, that's also useful data22:35
lifelesshaypo: well, not me you need to apologise to :)22:35
*** haypo has left #openstack-oslo22:37
