Google Container Engine - What's the recommended way to replace a bad GKE node instance?
Using gcloud container clusters resize I can scale my cluster up and down. However, I find no way to target a specific Compute Engine VM instance for removal when resizing down.
Scenario: our Compute Engine logs indicate that one instance suffers from a failure to dismount a volume from a Kubernetes pod that is long since gone. The cluster is appropriately sized, and the malfunctioning node still serves containers, but at maximum CPU load.
Obviously I'd want a new Kubernetes node to be ready before I kill off the old one. Is it safe to just resize up and then delete the instance using gcloud compute, or is there a container-aware way to do this?
"However, I find no way to target a specific Compute Engine VM instance for removal when resizing down."
There isn't a way to specify which VM to remove using the GKE API, but you can use the managed instance groups API to delete individual instances from the group. This shrinks your number of nodes by the number of instances you delete, so if you want to replace the nodes, you will then want to scale up your cluster to compensate. You can find the instance group name by running:
$ gcloud container clusters describe CLUSTER | grep instanceGroupManagers
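As a rough sketch of the managed instance group route, one could add a node first and then delete the bad VM from the group, which brings the cluster back to its original size. This assumes current gcloud flag names (older releases used --size instead of --num-nodes), and the cluster, group, instance, and zone arguments are hypothetical placeholders. The run helper only prints the commands unless DRY_RUN=0 is set:

```shell
# Sketch only: all names passed in are placeholders, not values from the question.
# With DRY_RUN=1 (the default) the commands are printed instead of executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Usage: replace_via_group CLUSTER GROUP INSTANCE ZONE CURRENT_SIZE
replace_via_group() {
  cluster=$1 group=$2 instance=$3 zone=$4 current_size=$5
  # Step 1: grow the cluster by one node so capacity is there first.
  run gcloud container clusters resize "$cluster" \
    --zone "$zone" --num-nodes "$((current_size + 1))" --quiet
  # Step 2: delete the specific VM from the group; this also shrinks
  # the group by one, back to the original node count.
  run gcloud compute instance-groups managed delete-instances "$group" \
    --zone "$zone" --instances "$instance"
}
```

For example, `replace_via_group my-cluster my-group bad-node us-central1-a 3` prints the two commands for a 3-node cluster (every name here is made up). It is probably worth checking `kubectl get nodes` between the two steps to confirm the new node is Ready.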
"Is it safe to just resize up and then delete the instance using gcloud compute, or is there a container-aware way to do this?"
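If you delete an instance directly, the managed instance group will replace it with a new one (so you will be left with an extra node if you scale up by one and then delete the troublesome instance). If you are not concerned about the temporary loss of capacity, you can just delete the VM and let it be recreated.
Before removing an instance, you can run kubectl drain to remove the workload from that node. This results in faster rescheduling of pods than if you simply delete the instance and wait for the controllers to notice that it is gone.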
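The drain-then-delete sequence might look roughly like the following. This is a sketch under assumptions: the node and zone arguments are hypothetical placeholders, the node name is assumed to match the Compute Engine instance name (as it normally does on GKE), and --ignore-daemonsets is added because drain otherwise refuses to proceed on nodes running DaemonSet pods. Setting DRY_RUN=1 prints the commands instead of running them:

```shell
# Sketch only: arguments are placeholders, not values from the question.
# Usage: drain_and_delete NODE ZONE
drain_and_delete() {
  node=$1 zone=$2
  # Evict the pods first so their controllers reschedule them right away;
  # stop here if the drain fails.
  ${DRY_RUN:+echo} kubectl drain "$node" --ignore-daemonsets || return 1
  # Delete the VM itself; the managed instance group then recreates it.
  ${DRY_RUN:+echo} gcloud compute instances delete "$node" --zone "$zone" --quiet
}
```

Pods that use local storage or are not managed by a controller may need additional drain flags, which depend on the kubectl version in use.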