*** hyunsikyang has joined #openstack-fenix | 04:28 | |
*** hyunsikyang__ has quit IRC | 04:31 | |
tojuvone | Hi JangwonLee, hyunsikyang Made some comments. Please ask if something is unclear, or ask in the review so that others can see it there too. | 05:18
tojuvone | lead_time is the maximum time Fenix waits before it takes action (the time Fenix waits for a reply). The VNFM should reply immediately when ready, but at latest within this time | 05:29
tojuvone | recover_time is how long it takes for a VNF VM to recover after it is migrated; after this time the VM is expected to be fully operational again. After migrating, Fenix still counts this VM against max_impacted_members for the time defined in recover_time. After that it is not counted there anymore and Fenix can again migrate another VM of the same instance_group, as long as it does not go over max_impacted_members. | 05:35
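As a rough sketch of the constraints being discussed (the field names below are assumptions drawn from this conversation, not necessarily the exact Fenix project API schema), the per-group and per-instance values might look like:

```python
# Illustrative only: field names follow the discussion above, not the exact
# Fenix schema. One record per instance group, one per VM instance.
instance_group_constraints = {
    "group_name": "instance_group_A",    # hypothetical group of VMs of one flavor
    "max_impacted_members": 2,           # how many of its VMs may be impacted at once
    "recovery_time": 120,                # seconds a migrated VM still counts as impacted
}

instance_constraints = {
    "instance_name": "VM1",              # hypothetical VM belonging to the group above
    "lead_time": 60,                     # max seconds Fenix waits for the VNFM to reply
    "migration_type": "LIVE_MIGRATION",  # assumed value for illustration
}
```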
*** hyunsikyang__ has joined #openstack-fenix | 05:38 | |
hyunsikyang__ | Hi tojuvone! | 05:39 |
*** hyunsikyang has quit IRC | 05:42 | |
tojuvone | hyunsikyang__: Good morning! | 06:29 |
-openstackstatus- NOTICE: Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved. | 09:02 | |
*** ChanServ changes topic to "Zuul is currently failing testing, please refrain from recheck and submitting of new changes until this is solved." | 09:02 | |
-openstackstatus- NOTICE: Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved. | 09:13 | |
*** ChanServ changes topic to "Zuul is currently failing all testing, please refrain from approving, rechecking or submitting of new changes until this is solved." | 09:13 | |
hyunsikyang__ | Hi tojuvone, I had a meeting and an urgent job, so sorry for the late reply. We understand your point. | 10:13
hyunsikyang__ | We are also thinking about the scope of this. | 10:13
hyunsikyang__ | And we are not sure what the exact scope of this is. Because if we just make a general procedure, it won't work. | 10:14
hyunsikyang__ | So, we are trying to make a real demo with these patches. | 10:14
tojuvone | Yes, so looking at the review and considering that the end solution cannot easily be done, what you have done is almost there. Just maybe the couple of review comments that address what can be done | 10:15
tojuvone | like "maintenance" service endpoint discovery... | 10:15 |
tojuvone | and if you have a server group in Nova, those details should be copied to "instance group" | 10:16 |
tojuvone | Another thing would be to continue with some example VNF that could have full-blown functionality. Perhaps also as another patch set if you are looking to make it. | 10:17
hyunsikyang__ | Ok. we will update. | 10:18 |
tojuvone | The ultimate test case at the end could then even have different loads for the VNF, so it needs to change the instance group and instance constraints on the fly. | 10:18
hyunsikyang__ | And | 10:18 |
tojuvone | but I think that is so huge, one surely would not aim for it now | 10:19 |
hyunsikyang__ | yes. So | 10:19 |
hyunsikyang__ | Now what is the workaround solution for that? | 10:19
hyunsikyang__ | How about using metadata for VNF? | 10:19 |
hyunsikyang__ | When we create VNF, we can define metadata.. | 10:20 |
tojuvone | That sounds like a fast solution. | 10:21 |
tojuvone | I guess the metadata should then have the needed constraints for the different "groups" of VMs it has (by flavor) or for anti-affinity grouped VMs. | 10:22
tojuvone | If metadata can be changed on the fly and we know it is not a busy hour for the VNF when we maintain it... | 10:23
tojuvone | metadata could be changed in a non-busy hour before maintenance starts, and e.g. "max_impacted_members" can then be bigger than it normally would be | 10:24
tojuvone | or the metadata expects the constraints to be used only for maintenance at a non-busy hour, and thus the constraints / metadata can be static | 10:25
tojuvone | then it is easy to use metadata for this? No need for fancy dynamically changing constraints in this implementation | 10:26
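A minimal sketch of the metadata idea above, assuming python-novaclient; the metadata keys are hypothetical and only illustrate carrying constraints such as max_impacted_members as server metadata:

```python
from keystoneauth1 import loading, session
from novaclient import client as nova_client

# Credentials, endpoint and the VM UUID below are placeholders.
loader = loading.get_plugin_loader("password")
auth = loader.load_from_options(auth_url="http://keystone:5000/v3",
                                username="demo", password="secret",
                                project_name="demo",
                                user_domain_name="Default",
                                project_domain_name="Default")
nova = nova_client.Client("2", session=session.Session(auth=auth))

server = nova.servers.get("11111111-2222-3333-4444-555555555555")
# Hypothetical keys: a VNFM (or the VNF descriptor) could publish maintenance
# constraints this way, to be read later when maintenance starts.
nova.servers.set_meta(server, {
    "instance_group": "group_A",
    "max_impacted_members": "2",   # static value chosen for the non-busy hour
    "recovery_time": "120",
    "lead_time": "60",
})
```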
hyunsikyang__ | But we only set metadata when we create the VNF. | 10:27
hyunsikyang__ | Hum.. | 10:27 |
tojuvone | You just write them once to Fenix, or if you change them, maybe remember the original metadata value before scaling; you then have to change the "max_impacted_members" constraint in Fenix after scaling down, as there are fewer VMs at that time | 10:28
tojuvone | yes, one problem is that max_impacted_members depends on how many instances of a certain VM exist | 10:29
tojuvone | Do you follow what I mean with max_impacted_members? | 10:30 |
hyunsikyang__ | Not sure. As I understand it, it is a member that is affected by any maintenance action. | 10:31
hyunsikyang__ | But I am not sure how we decide the impacted members. | 10:31
hyunsikyang__ | Is it predefined? or | 10:31 |
hyunsikyang__ | or can we define it when we want, such as when starting scaling or any maintenance? | 10:32
tojuvone | oh... let me explain | 10:32 |
*** dasp has quit IRC | 10:32 | |
tojuvone | we have one type of VM, with many instances: VM1, VM2, VM3, VM4 | 10:32 |
tojuvone | VM1 and VM2 are on HOST1, VM3 and VM4 on HOST2, and HOST3 is empty | 10:33
tojuvone | Fenix is executing the maintenance workflow | 10:33
tojuvone | There is only one empty host, so it cannot do 2 hosts in parallel | 10:34
hyunsikyang__ | yes. | 10:34 |
tojuvone | constraints are saying that max_impacted_members=2 | 10:34 |
tojuvone | Fenix decides to have maintenance on HOST1 | 10:35
tojuvone | according to the constraints it can do "migration" for VM1 and VM2 in parallel | 10:35
*** dasp has joined #openstack-fenix | 10:35 | |
tojuvone | for both of these it sends a PLANNED_MAINTENANCE message separately to the VNFM | 10:36
tojuvone | and these 2 VMs are now impacted | 10:36 |
tojuvone | the VNFM has the lead_time in which it needs to reply back that it is ready | 10:36
tojuvone | that is defined in constraints | 10:36 |
tojuvone | after the reply, Fenix makes the migration and waits the recover_time | 10:37
tojuvone | after that has been waited for both of these VMs, there are 0 impacted members | 10:37
tojuvone | so Fenix could again have 2 parallel migrations according to max_impacted_members | 10:38
tojuvone | And now if we scaled from 10 VMs, we had VM1 - VM9 | 10:38
tojuvone | probably the max_impacted_members might have been 8 or something like that originally | 10:39
tojuvone | but Fenix surely can only do as many as there are empty target hosts | 10:39
tojuvone | Btw when using the ETSI constraints, the nfv.py workflow also has only one VM instance in a PLANNED_MAINTENANCE event, as all VMs are done in parallel | 10:41
tojuvone | All possible VMs; in the above it was 2 VMs at a time | 10:42
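Walking through the example above as a small illustration (not Fenix code): one instance group with 4 VMs and max_impacted_members = 2, so at most 2 VMs are in maintenance (migration plus recover_time) at any moment:

```python
vms = ["VM1", "VM2", "VM3", "VM4"]
max_impacted_members = 2

# VMs are handled in groups of at most max_impacted_members; the next group
# starts only after the previous VMs have migrated and recover_time has
# passed, i.e. when the impacted count has dropped back to 0.
batches = [vms[i:i + max_impacted_members]
           for i in range(0, len(vms), max_impacted_members)]
print(batches)  # [['VM1', 'VM2'], ['VM3', 'VM4']]

# In practice the parallelism is also limited by the number of empty target
# hosts, as noted above.
```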
hyunsikyang__ | BTW, do we need to change max_impacted_members before maintenance according to the number of current instances on the host? | 10:47
tojuvone | max_impacted_members does not have a relation to the host | 10:47
hyunsikyang__ | in your example, | 10:48 |
tojuvone | max_impacted_members relates to how many instances of a certain type of VM there are in the VNF | 10:48
tojuvone | so how many instances can be impacted in the "instance group" | 10:48
tojuvone | So that is just from the VNF perspective, to say how many instances (VMs) it needs for the service it provides | 10:49
tojuvone | oh... how many instances can be impacted so that it can still provide the service those instances offer | 10:50
tojuvone | and it comes from the ETSI definitions that it is max_impacted_members, while it might have been nicer the other way around | 10:50
tojuvone | "least_number_of_instances_needed" instead of "max_impacted_members" | 10:51
tojuvone | Now as it is max_impacted_members, it needs changing if the number of VNF instances is scaled. If it were defined the other way around, you might not have needed to change the value. | 10:52
hyunsikyang__ | It means the number of VNFs needed for the service as a minimum. | 10:57
hyunsikyang__ | right? | 10:57
tojuvone | yes, exactly | 10:58
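In other words, the two ways of expressing the constraint are related roughly like this (illustrative arithmetic only):

```python
# With max_impacted_members the value must be updated when the group scales;
# a "least number of instances needed" value would stay the same.
current_instances = 4
least_number_of_instances_needed = 2
max_impacted_members = current_instances - least_number_of_instances_needed  # -> 2
```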
hyunsikyang__ | So, according to your explanation, it is '2' because we should leave two vnf, which is half of them, for the specific service. | 10:58
hyunsikyang__ | But why is it '0' after the recovery time? | 10:59
tojuvone | "two vnf" -> "2 VMs belonging to the same "instance group" in a single VNF" | 11:00
tojuvone | oh, my bad. I think you misunderstood | 11:00 |
tojuvone | Fenix needs to internally keep count of how many VMs of a certain instance_group are affected, against the max_impacted_members | 11:01
tojuvone | so the Fenix workflow has some variable that keeps the impacted_members count | 11:02
tojuvone | and compares it to max_impacted_members | 11:02
tojuvone | impacted_members inside Fenix is 0 after those VMs are migrated and recover_time has passed | 11:03
tojuvone | Fenix can try to migrate who knows how many VMs, but they are in a queue if the Fenix internal count impacted_members has reached max_impacted_members | 11:04
tojuvone | So Fenix has its own thread to handle each and every migration. In the beginning it increments impacted_members and when the thread is done it reduces impacted_members | 11:05
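A rough sketch of the accounting described above: each migration runs in its own thread, the impacted count is taken before migrating and released only after migration plus recover_time. The names and the semaphore-based bookkeeping are illustrative, not Fenix internals:

```python
import threading
import time

class ImpactedCounter:
    def __init__(self, max_impacted_members):
        # The semaphore value is the number of VMs that may be impacted at once.
        self._slots = threading.Semaphore(max_impacted_members)

    def migrate(self, vm, recover_time=2):
        with self._slots:               # VM counts as impacted while this is held
            print(f"{vm}: migrating")
            time.sleep(1)               # stand-in for the live migration
            time.sleep(recover_time)    # still counted until recover_time has passed
        print(f"{vm}: no longer impacted")

counter = ImpactedCounter(max_impacted_members=2)
threads = [threading.Thread(target=counter.migrate, args=(vm,))
           for vm in ["VM1", "VM2", "VM3", "VM4"]]
for t in threads:
    t.start()
for t in threads:
    t.join()
```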
JangwonLee_ | I think "max_impacted_members" is just an indicator that shows the max number of migrations when Fenix does maintenance on an instance group. Is this right? | 11:06
JangwonLee_ | max number of instances | 11:06
tojuvone | Basically yes | 11:06 |
tojuvone | and normally a VNF consists of different types of VMs that have their own instance_group and max_impacted_members according to those VMs | 11:07
hyunsikyang__ | So, when the infra is busy, we should raise the value to maintain the service. | 11:07
tojuvone | When the VNF is busy it needs more instances for all kinds of VMs it has | 11:08
tojuvone | then for all those different instance groups those instances belong to, the max_impacted_members should be smaller | 11:09
tojuvone | or if we now consider that we know maintenance is always during the night | 11:09
tojuvone | we could statically define max_impacted_members to a value that expects the service level to be lower at night time | 11:10
tojuvone | Then we do not need to know the VNF load level, but thus we can do the maintenance only with the condition that it is done at night time | 11:11
tojuvone | that is the usual case | 11:11 |
JangwonLee_ | yes | 11:11 |
tojuvone | If we have a sophisticated manager that knows the load and max_impacted_members can be dynamically changed, we could run maintenance 24/7 | 11:12
tojuvone | Just that you might be able to migrate only a single VM at a time or so | 11:12
tojuvone | I think some non-Telco users were interested in Fenix for this kind of case, where some host is always under maintenance in a huge cloud | 11:13
hyunsikyang__ | I understand why max_impacted_members is changed. | 11:14
tojuvone | Great, there are so many details in all this :) | 11:16
hyunsikyang__ | Recheck. But one more question: if max_impacted_members = 1, does Fenix migrate one vnf at a time? | 11:16
hyunsikyang__ | or total - max? | 11:16 |
tojuvone | I do not understand this "one VNF"? To me a VNF is one application that is represented as one tenant/project in OpenStack | 11:17
tojuvone | and it has VMs of different flavors | 11:18
hyunsikyang__ | Ah. 1 instance. | 11:18 |
hyunsikyang__ | sorry. | 11:18 |
tojuvone | like VM1 and VM2 of flavor A, that belong to an instance_group, let's say with the same name A | 11:19
hyunsikyang__ | VNF is a kind of application, you said. | 11:19
tojuvone | yes, an application consisting of different types of VMs | 11:19
tojuvone | so you could have VM3 and VM4 of flavor B and instance_group B | 11:21
tojuvone | max_impacted_members would then have a different value for instance_group A and B | 11:22
tojuvone | let's say instance group A is active/standby and always just 2 instances | 11:22
tojuvone | instance_group B has a different number of VMs, possibly according to load. | 11:23
tojuvone | maybe 10 normally | 11:23
tojuvone | max_impacted_members would always be 1 for instance_group A | 11:23
tojuvone | whereas with instance_group B it could change when the load changes and the number of VMs in that group changes | 11:24
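The two hypothetical instance groups described above, side by side (values come from this example, the field names are illustrative):

```python
instance_groups = {
    "A": {"flavor": "A",
          "members": ["VM1", "VM2"],                    # active/standby pair
          "max_impacted_members": 1},                   # never take both down at once
    "B": {"flavor": "B",
          "members": [f"VM{i}" for i in range(3, 13)],  # ~10 VMs under normal load
          "max_impacted_members": 8},                   # may change when the group scales
}
```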
hyunsikyang__ | YEs. | 11:24 |
hyunsikyang__ | Right. | 11:24 |
hyunsikyang__ | So | 11:24 |
hyunsikyang__ | now, we should find a way to change the max_impacted_members... | 11:24
hyunsikyang__ | When Fenix starts maintenance, | 11:25
tojuvone | in instance group A there is normally just a switchover of which VM is active | 11:25
hyunsikyang__ | in the case of group B, | 11:25 |
tojuvone | if VM1 is on the host to be maintained and is active, the floating IP will for example be changed to VM2, and only then does the VNFM reply back to Fenix to migrate VM1 | 11:25
tojuvone | yes, there are now 2 options | 11:27
tojuvone | 1. Find a way to change max_impacted_members dynamically | 11:27
tojuvone | 2. Assume the value of max_impacted_members is only for night time with a small load and can then have a static value. Surely the workflow then needs to know it is static if the VNF is scaled | 11:29
hyunsikyang__ | In the case of 2, how do we get the value | 11:29
hyunsikyang__ | ? | 11:29 |
hyunsikyang__ | just use a fixed value in the VNFM like in the patch? | 11:30
tojuvone | yes, then the constraints only need to be written once for the instance_group when the VNF is created / maintenance is called. | 11:33
tojuvone | for a VM instance one always needs to write constraints for a new instance and remove them when the instance is removed | 11:34
tojuvone | then if we see that the number of VMs is scaled during maintenance, the VNFM should perhaps read the static definition for max_impacted_members from the metadata when maintenance starts... | 11:37
tojuvone | and when the number of VMs is scaled, change the value accordingly | 11:37
tojuvone | The other way is to expect that the Fenix workflow is designed for static values, and the same max_impacted_members would apply regardless of the scaling | 11:38
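Option 2 could look something like the sketch below: the VNFM reads a static max_impacted_members from server metadata when maintenance starts and adjusts it if the group has scaled. The metadata key, the expected_size parameter and the adjustment rule are assumptions for illustration, not an agreed design:

```python
def max_impacted_for_group(servers, expected_size):
    """servers: Nova servers of one instance group, each carrying the
    hypothetical 'max_impacted_members' metadata written at VNF creation."""
    static_value = int(servers[0].metadata.get("max_impacted_members", "1"))
    if len(servers) == expected_size:
        return static_value
    # The group has scaled: keep the same minimum number of serving VMs
    # that the static value implied for the original group size.
    minimum_needed = expected_size - static_value
    return max(1, len(servers) - minimum_needed)
```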
hyunsikyang__ | In the case where a VNF consists of several VDUs with the same function, everything is fine. | 11:38
hyunsikyang__ | But in the case where a VNF consists of multiple VDUs and they are not the same flavor, it is hard to configure the value for each VDU. | 11:39
tojuvone | or it needs to be metadata in the flavor | 11:40
hyunsikyang__ | OK. we will figure out! | 11:41 |
tojuvone | yes | 11:41 |
tojuvone | Well, you know the Tacker internals also to figure this out ;) | 11:42 |
hyunsikyang__ | Thank you for your support. Now it is more clear. | 11:42
tojuvone | great and no problem. We have a common goal :) | 11:43 |
tojuvone | Would be so great to have all this in Tacker | 11:43 |
tojuvone | Thank you for all that you are doing | 11:43
*** ChanServ changes topic to "Welcome to Fenix: https://wiki.openstack.org/wiki/Fenix" | 12:25 | |
-openstackstatus- NOTICE: Zuul has been restarted, all events are lost, recheck or re-approve any changes submitted since 9:50 UTC. | 12:25 | |
*** JangwonLee_ has quit IRC | 15:05 | |
*** JangwonLee_ has joined #openstack-fenix | 15:05 | |
*** tojuvone has quit IRC | 21:09 | |
*** tojuvone has joined #openstack-fenix | 21:13 |