Your company has a Kubernetes application that pulls messages from Pub/Sub and stores them in Filestore.
Because the application is simple, it was deployed as a single pod.
The infrastructure team has analyzed Pub/Sub metrics and discovered that the application cannot process the messages in real time.
Most messages wait for minutes before being processed.
You need to scale this I/O-intensive processing.
What should you do?
A. Use kubectl autoscale deployment APP_NAME --max 6 --min 2 --cpu-percent 50 to configure a Kubernetes autoscaling deployment.
B. Configure a Kubernetes autoscaling deployment based on the subscription/push_request_latencies metric.
C. Use the --enable-autoscaling flag when you create the Kubernetes cluster.
D. Configure a Kubernetes autoscaling deployment based on the subscription/num_undelivered_messages metric.
Reference: https://cloud.google.com/kubernetes-engine/docs/how-to/cluster-autoscaler

Below is a detailed explanation of each answer option for scaling the Kubernetes application that pulls messages from Pub/Sub and stores them in Filestore.
Option A: Use kubectl autoscale deployment APP_NAME --max 6 --min 2 --cpu-percent 50 to configure a Kubernetes autoscaling deployment.
This option uses the kubectl autoscale command to create a Horizontal Pod Autoscaler (HPA) for the application's Deployment. The HPA keeps between 2 and 6 replicas and targets 50% average CPU utilization: when average CPU utilization across the pods exceeds 50%, Kubernetes adds replicas (up to 6), and when it falls well below the target, Kubernetes removes them (down to 2).
However, this is not the best choice for the I/O-intensive workload described in the question. An I/O-bound pod can accumulate a large message backlog while its CPU utilization stays low, so a CPU-based HPA may never scale up at all. CPU utilization is simply not an accurate proxy for this application's load.
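For reference, the kubectl autoscale command above is roughly equivalent to applying the following HorizontalPodAutoscaler manifest (a sketch; APP_NAME stands in for the real Deployment name):

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: app-name-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: APP_NAME
      minReplicas: 2
      maxReplicas: 6
      metrics:
      - type: Resource            # scale on a built-in resource metric
        resource:
          name: cpu
          target:
            type: Utilization     # target average CPU utilization across all pods
            averageUtilization: 50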
Option B: Configure a Kubernetes autoscaling deployment based on the subscription/push_request_latencies metric.
This option suggests configuring Kubernetes autoscaling based on the subscription/push_request_latencies metric, which measures the latency of push requests that Pub/Sub sends to a push endpoint. The metric is only populated for push subscriptions.
The application in the question pulls messages from Pub/Sub, so its subscription is a pull subscription and this metric reports no data for it. An autoscaler built on it would have nothing to react to, which makes option B unsuitable even though it sounds latency-related.
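You can confirm which kind of subscription you are dealing with before picking a metric (a hedged example; my-subscription is a hypothetical name):

    gcloud pubsub subscriptions describe my-subscription
    # A pull subscription shows an empty "pushConfig: {}" in the output;
    # a push subscription lists a pushEndpoint URL there.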
Option C: Use the --enable-autoscaling flag when you create the Kubernetes cluster.
This option suggests enabling the cluster autoscaler when creating the Kubernetes cluster. The cluster autoscaler adds or removes nodes based on pending pods; it never changes the number of replicas of a Deployment. Since the application was deployed as a single pod, the cluster could scale its nodes while the application itself stays at one replica, so this does not solve the problem.
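For illustration, this is what enabling the cluster autoscaler at creation time looks like (a sketch; the cluster name, zone, and node limits are placeholders):

    gcloud container clusters create my-cluster \
        --zone us-central1-a \
        --num-nodes 3 \
        --enable-autoscaling --min-nodes 1 --max-nodes 5
    # The node pool now scales between 1 and 5 nodes,
    # but the application Deployment still runs a single replica.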
Option D: Configure a Kubernetes autoscaling deployment based on the subscription/num_undelivered_messages metric.
This option suggests configuring Kubernetes autoscaling based on the subscription/num_undelivered_messages metric, which reports the number of unacknowledged (backlog) messages in a subscription. It applies to pull subscriptions like the one in the question and directly reflects the symptom the infrastructure team observed: messages waiting for minutes before being processed.
This is the best choice for the described I/O-intensive workload: replicas are added when the backlog grows and removed when it drains, independent of CPU usage. It does require additional setup, because Pub/Sub metrics live in Cloud Monitoring and must be exposed to the Horizontal Pod Autoscaler as an external metric (for example, via the Custom Metrics Stackdriver Adapter on GKE).
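As a sketch of that setup, assuming the Custom Metrics Stackdriver Adapter (or equivalent) is installed in the cluster and my-subscription is a placeholder subscription ID:

    apiVersion: autoscaling/v2
    kind: HorizontalPodAutoscaler
    metadata:
      name: pubsub-worker-hpa
    spec:
      scaleTargetRef:
        apiVersion: apps/v1
        kind: Deployment
        name: APP_NAME
      minReplicas: 2
      maxReplicas: 6
      metrics:
      - type: External            # metric comes from Cloud Monitoring, outside the cluster
        external:
          metric:
            name: pubsub.googleapis.com|subscription|num_undelivered_messages
            selector:
              matchLabels:
                resource.labels.subscription_id: my-subscription  # hypothetical subscription
          target:
            type: AverageValue    # backlog is divided across replicas
            averageValue: "100"   # aim for ~100 undelivered messages per replica

With this in place, the HPA adds replicas whenever the backlog per replica exceeds roughly 100 messages and scales back down as the queue drains.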
In conclusion, the best option for scaling the Kubernetes application that pulls messages from Pub/Sub and stores them in Filestore is D: autoscale on subscription/num_undelivered_messages, the metric that directly measures the backlog of a pull subscription. Option B relies on a push-subscription metric that does not apply here, option A scales on CPU, which stays low for an I/O-bound workload, and option C only autoscales the cluster's nodes, not the application.