Building an Azure AKS Operator for Dynamic Secrets Management with Vault and Prometheus Monitoring


A typical application deployment to Kubernetes often begins with a messy Deployment YAML. Configuration is scattered across ConfigMaps, and sensitive information is either injected through a CI/CD pipeline or, worse, base64-encoded and committed to a Git repository. Monitoring configuration is a completely separate concern, requiring an SRE team to manually create a ServiceMonitor or modify Prometheus’s static configs. The direct consequence of this fragmentation is operational friction and real security risk.

Consider this common deployment fragment:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: billing-service
spec:
  replicas: 3
  template:
    spec:
      containers:
      - name: billing-service
        image: my-registry/billing-service:1.2.0
        env:
        - name: DATABASE_URL
          valueFrom:
            secretKeyRef:
              name: db-credentials
              key: url
        - name: API_KEY_PROVIDER_X
          valueFrom:
            secretKeyRef:
              name: provider-x-keys
              key: api-key
# ...
---
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials
type: Opaque
data:
  url: "cG9zdGdyZXM6Ly91c2VyOnBhc3NAdGhpcy1pc..." # Base64 encoded, static secret
# ...
---
# Somewhere else, in another repository, managed by another team...
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: billing-service-monitor
  labels:
    team: billing
spec:
  endpoints:
  - port: http-metrics
    interval: 15s
  selector:
    matchLabels:
      app: billing-service

The core pain points here are immediately obvious:

  1. Static Secret Management: The db-credentials Secret is static, difficult to rotate, and its lifecycle is decoupled from the application.
  2. Fragmented Configuration: The application’s deployment definition, secret sources, and monitoring configuration are three separate objects that require manual coordination.
  3. Operational Overhead: Every time a new application is onboarded or an existing one is modified, multiple teams must collaborate to update these scattered configurations, a process that is highly error-prone.

Our goal is to create a unified, declarative API that allows a developer to define a single resource describing all of an application’s “external dependencies”—including which dynamic secrets it needs and how it should be monitored. The Kubernetes Operator is the ideal framework for achieving this. We will build a VaultApp Operator that runs on Azure AKS, responsible for coordinating application deployments, injecting dynamic secrets from HashiCorp Vault, and automatically configuring service monitoring for Prometheus.

Architectural Decisions and Technology Stack

Before we start building, it’s crucial to clarify why this specific tech stack was chosen and how the components work together.

  • Azure Kubernetes Service (AKS): As a managed Kubernetes service, AKS offloads the burden of maintaining the control plane. More importantly, its deep integration with the Azure ecosystem (like Azure AD Workload Identity) provides a secure authentication mechanism for connecting to external systems like Vault.
  • HashiCorp Vault: We are deliberately not using the Vault Agent Injector. While it excels at secret injection, our goal is broader. We need a central controller to orchestrate multiple systems. An Operator can implement more complex logic, such as choosing an injection method based on secret type or performing specific actions after a secret rotation (like a rolling restart of a Deployment), all while managing Prometheus configurations that are unrelated to secrets.
  • Prometheus Operator: We don’t interact with Prometheus directly. Instead, we leverage the ServiceMonitor CRD provided by its operator. This turns monitoring configuration itself into a Kubernetes-native resource. Our VaultApp Operator simply needs to create, update, or delete ServiceMonitor objects, and the Prometheus Operator handles the rest.
  • Kubebuilder (Go): This is the leading framework for building operators. It generates the CRD definitions, controller scaffolding, and all the boilerplate code, allowing us to focus on implementing the core reconciliation logic.
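
As a concrete starting point, the scaffolding for such an operator is typically generated with Kubebuilder. Assuming the group and domain used by the CR later in this article (app.techweaver.io) and the repo path used in the Go imports, the commands would look roughly like this:

# Scaffold the project and the VaultApp API
$ kubebuilder init --domain techweaver.io --repo github.com/your-org/vaultapp-operator
$ kubebuilder create api --group app --version v1alpha1 --kind VaultApp --resource --controller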

The overall workflow architecture is as follows:

graph TD
    subgraph "Developer Workflow"
        A[Developer] -- writes & applies --> B(VaultApp CR YAML);
    end

    subgraph "Azure AKS Cluster"
        B -- is watched by --> C{VaultApp Operator};
        C -- reads --> B;
        C -- K8s API --> D{Reconciliation Loop};
        D -- 1. Authenticate --> E[HashiCorp Vault];
        E -- returns token --> D;
        D -- 2. Fetch Secrets --> E;
        E -- returns secrets --> D;
        D -- 3. Creates/Updates --> F[Kubernetes Secret];
        D -- 4. Creates/Updates --> G[Deployment];
        D -- 5. Creates/Updates --> H[ServiceMonitor];
        G -- mounts --> F;
        I[Prometheus Operator] -- watches --> H;
        I -- configures --> J[Prometheus];
        J -- scrapes metrics from --> G;
    end

Step 1: Defining the VaultApp API

Everything starts with API design. We need a Custom Resource Definition (CRD) to encapsulate our intent. This VaultApp resource needs to include:

  • A standard Deployment template to define the application itself.
  • A Vault configuration section to specify the secret paths to fetch from Vault and their corresponding key names in the final Kubernetes Secret.
  • A Monitoring configuration section to describe the Prometheus scrape endpoint.

In Go, this translates to struct definitions in the api/v1alpha1/vaultapp_types.go file.

// api/v1alpha1/vaultapp_types.go

package v1alpha1

import (
	appsv1 "k8s.io/api/apps/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// VaultSecretSpec defines the details of a secret to be fetched from Vault.
type VaultSecretSpec struct {
	// Path is the full path to the secret in Vault (e.g., "kv/data/billing/database").
	// +kubebuilder:validation:Required
	Path string `json:"path"`

	// Key is the specific key within the secret data to retrieve.
	// +kubebuilder:validation:Required
	Key string `json:"key"`

	// TargetKey is the key name to be used in the resulting Kubernetes Secret.
	// +kubebuilder:validation:Required
	TargetKey string `json:"targetKey"`
}

// VaultSpec defines the Vault integration configuration.
type VaultSpec struct {
	// Role is the Vault Kubernetes Auth Role to use for authentication.
	// +kubebuilder:validation:Required
	Role string `json:"role"`

	// Secrets is a list of secrets to fetch from Vault.
	// +kubebuilder:validation:MinItems=1
	Secrets []VaultSecretSpec `json:"secrets"`
}

// MonitoringSpec defines the Prometheus monitoring configuration.
type MonitoringSpec struct {
	// Enabled indicates if monitoring should be set up.
	// +kubebuilder:validation:Required
	Enabled bool `json:"enabled"`

	// Port is the name of the container port to scrape metrics from.
	// +kubebuilder:validation:Optional
	Port string `json:"port,omitempty"`

	// Path is the metrics endpoint path. Defaults to "/metrics".
	// +kubebuilder:validation:Optional
	// +kubebuilder:default:="/metrics"
	Path string `json:"path,omitempty"`
}

// VaultAppSpec defines the desired state of VaultApp
type VaultAppSpec struct {
	// DeploymentSpec is the template for the application Deployment.
	// The operator will manage a Deployment based on this spec.
	// +kubebuilder:validation:Required
	DeploymentSpec appsv1.DeploymentSpec `json:"deploymentSpec"`

	// Vault defines the integration with HashiCorp Vault.
	// +kubebuilder:validation:Required
	Vault VaultSpec `json:"vault"`

	// Monitoring defines the Prometheus ServiceMonitor configuration.
	// +kubebuilder:validation:Optional
	Monitoring *MonitoringSpec `json:"monitoring,omitempty"`
}

// VaultAppStatus defines the observed state of VaultApp
type VaultAppStatus struct {
	// Conditions represent the latest available observations of an object's state.
	Conditions []metav1.Condition `json:"conditions,omitempty"`
	// SecretName is the name of the managed Kubernetes Secret.
	SecretName string `json:"secretName,omitempty"`
	// LastSecretUpdateTime is the last time the secret was successfully updated from Vault.
	LastSecretUpdateTime *metav1.Time `json:"lastSecretUpdateTime,omitempty"`
}

//+kubebuilder:object:root=true
//+kubebuilder:subresource:status

// VaultApp is the Schema for the vaultapps API
type VaultApp struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`

	Spec   VaultAppSpec   `json:"spec,omitempty"`
	Status VaultAppStatus `json:"status,omitempty"`
}

//+kubebuilder:object:root=true

// VaultAppList contains a list of VaultApp
type VaultAppList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items           []VaultApp `json:"items"`
}

func init() {
	SchemeBuilder.Register(&VaultApp{}, &VaultAppList{})
}

This definition is crystal clear. It aggregates all related configurations into a single VaultApp resource. Next, we’ll implement the controller’s core logic to react to changes in this resource.
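
Before that, with the types in place, the CRD manifest and deepcopy code are regenerated and installed using the standard Kubebuilder Makefile targets (assuming the generated Makefile is unmodified):

# Regenerate zz_generated.deepcopy.go and config/crd manifests, then install the CRD
$ make generate manifests
$ make install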

Step 2: Implementing the Core Reconciliation Loop

The heart of the controller is the Reconcile function. It is invoked whenever a VaultApp resource is created, updated, or deleted, or when a sub-resource it manages (like a Deployment or Secret) changes. Its responsibility is to read the VaultApp’s Spec (the desired state), get the actual state from the cluster, and then perform the necessary actions to bring the actual state in line with the desired state.
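
For context, the reconciler type itself is the standard scaffold Kubebuilder generates: a struct embedding a client and carrying a scheme, which is what the r.Get, r.Client, and r.Scheme calls below rely on.

// internal/controller/vaultapp_controller.go (scaffold excerpt)

import (
	"k8s.io/apimachinery/pkg/runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// VaultAppReconciler reconciles a VaultApp object; controller-runtime injects
// the client and scheme when the manager is set up.
type VaultAppReconciler struct {
	client.Client
	Scheme *runtime.Scheme
}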

// internal/controller/vaultapp_controller.go

// Reconcile is part of the main kubernetes reconciliation loop which aims to
// move the current state of the cluster closer to the desired state.
func (r *VaultAppReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	log := log.FromContext(ctx)

	var vaultApp v1alpha1.VaultApp
	if err := r.Get(ctx, req.NamespacedName, &vaultApp); err != nil {
		if apierrors.IsNotFound(err) {
			// Object was deleted. The sub-objects are owned by it, so they will be garbage collected.
			// No need to do anything.
			log.Info("VaultApp resource not found. Ignoring since object must be deleted.")
			return ctrl.Result{}, nil
		}
		log.Error(err, "unable to fetch VaultApp")
		return ctrl.Result{}, err
	}

	// 1. Initialize Vault client
	// In a real project, Vault address and other configs should come from environment variables or a config map.
	vaultClient, err := r.getVaultClient(ctx, &vaultApp)
	if err != nil {
		log.Error(err, "failed to initialize Vault client")
		// Update status and requeue
		return ctrl.Result{}, r.updateStatusCondition(ctx, &vaultApp, "VaultClientReady", metav1.ConditionFalse, "VaultClientInitFailed", err.Error())
	}
	// ... Update status to ready ...

	// 2. Reconcile Vault Secret
	secretName := fmt.Sprintf("%s-vault-secrets", vaultApp.Name)
	secretData, err := r.fetchSecretsFromVault(vaultClient, vaultApp.Spec.Vault.Secrets)
	if err != nil {
		log.Error(err, "failed to fetch secrets from Vault")
		return ctrl.Result{RequeueAfter: 1 * time.Minute}, r.updateStatusCondition(ctx, &vaultApp, "SecretsFetched", metav1.ConditionFalse, "VaultFetchFailed", err.Error())
	}

	if err := r.reconcileK8sSecret(ctx, &vaultApp, secretName, secretData); err != nil {
		log.Error(err, "failed to reconcile Kubernetes Secret")
		return ctrl.Result{}, err // Requeue immediately on K8s API errors
	}
	// ... Update status to show secrets are fetched and the secret name ...

	// 3. Reconcile Deployment
	if err := r.reconcileDeployment(ctx, &vaultApp, secretName, secretData); err != nil {
		log.Error(err, "failed to reconcile Deployment")
		return ctrl.Result{}, err
	}
	// ... Update status for Deployment ...

	// 4. Reconcile ServiceMonitor
	if err := r.reconcileServiceMonitor(ctx, &vaultApp); err != nil {
		log.Error(err, "failed to reconcile ServiceMonitor")
		return ctrl.Result{}, err
	}
	// ... Update status for ServiceMonitor ...


	log.Info("Successfully reconciled VaultApp")
	return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil // Periodically re-sync, e.g., to check for secret rotation.
}

The Reconcile function itself is a dispatcher, breaking down a complex task into several independent, idempotent sub-functions. Let’s dive into a few of the key implementations.
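
Before diving in, one piece of wiring is worth showing: the Reconcile function is only triggered by changes to the Deployment, Secret, and ServiceMonitor it manages if those types are declared as owned in SetupWithManager. A minimal version, assuming the monitoringv1 types have been added to the manager’s scheme in main.go, looks like this:

// internal/controller/vaultapp_controller.go (setup excerpt)

// SetupWithManager registers the controller with the manager and declares the
// resource types whose changes should re-trigger reconciliation of the owning VaultApp.
func (r *VaultAppReconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&v1alpha1.VaultApp{}).
		Owns(&appsv1.Deployment{}).
		Owns(&corev1.Secret{}).
		Owns(&monitoringv1.ServiceMonitor{}).
		Complete(r)
}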

Core Library: Interacting with Vault

Communicating with Vault is the core function of this operator. We need a robust client that can authenticate using a Kubernetes service account token.

First, the Kubernetes auth backend needs to be configured in Vault. This is typically a one-time setup performed by the platform team:

# Enable Kubernetes auth method
$ vault auth enable kubernetes

# Configure the Kubernetes auth method to talk to our AKS cluster
$ vault write auth/kubernetes/config \
    kubernetes_host="https://<AKS_API_SERVER_URL>" \
    kubernetes_ca_cert=@ca.crt \
    token_reviewer_jwt=@reviewer-sa-token.jwt

# Create a role that binds a Vault policy to a Kubernetes service account
$ vault write auth/kubernetes/role/billing-app \
    bound_service_account_names=vaultapp-controller-manager \
    bound_service_account_namespaces=vaultapp-system \
    policies=read-billing-secrets \
    ttl=20m
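
The read-billing-secrets policy referenced above is plain Vault policy HCL. An illustrative version, matching the KVv2 paths used later in the VaultApp example, might be:

# read-billing-secrets.hcl -- read-only access to the billing secrets
path "kv/data/billing/*" {
  capabilities = ["read"]
}

$ vault policy write read-billing-secrets read-billing-secrets.hcl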

Here, the billing-app role is bound to our operator’s ServiceAccount (vaultapp-controller-manager), scoped to the vaultapp-system namespace. Next is the Go implementation, which will become one of our operator’s core libraries.

// internal/controller/vault_client.go

package controller

import (
	"context"
	"fmt"
	"os"

	vault "github.com/hashicorp/vault/api"
	auth "github.com/hashicorp/vault/api/auth/kubernetes"
	"github.com/your-org/vaultapp-operator/api/v1alpha1"
)

// getVaultClient initializes and authenticates a new Vault client
// using the Kubernetes Auth Method.
func (r *VaultAppReconciler) getVaultClient(ctx context.Context, app *v1alpha1.VaultApp) (*vault.Client, error) {
	// A production-grade operator would have a more sophisticated client caching/pooling mechanism.
	// For this example, we create a new client on each reconciliation.

	config := vault.DefaultConfig()
	// VAULT_ADDR should be set in the operator's deployment manifest.
	config.Address = os.Getenv("VAULT_ADDR")
	if config.Address == "" {
		return nil, fmt.Errorf("VAULT_ADDR environment variable not set")
	}

	client, err := vault.NewClient(config)
	if err != nil {
		return nil, fmt.Errorf("failed to create vault client: %w", err)
	}

	// The path to the service account token is automatically mounted by Kubernetes.
	k8sAuth, err := auth.NewKubernetesAuth(
		app.Spec.Vault.Role,
		auth.WithServiceAccountTokenPath("/var/run/secrets/kubernetes.io/serviceaccount/token"),
	)
	if err != nil {
		return nil, fmt.Errorf("failed to create kubernetes auth method: %w", err)
	}

	authInfo, err := client.Auth().Login(ctx, k8sAuth)
	if err != nil {
		return nil, fmt.Errorf("failed to log in with kubernetes auth: %w", err)
	}
	if authInfo == nil {
		return nil, fmt.Errorf("no auth info was returned after login")
	}

	return client, nil
}

// fetchSecretsFromVault iterates through the requested secrets in the spec,
// fetches them from Vault, and returns a map ready for a Kubernetes Secret.
func (r *VaultAppReconciler) fetchSecretsFromVault(client *vault.Client, secrets []v1alpha1.VaultSecretSpec) (map[string][]byte, error) {
	secretData := make(map[string][]byte)

	for _, s := range secrets {
		// This assumes KVv2 engine. A real implementation needs to handle different secret engines.
		logical := client.Logical()
		vaultSecret, err := logical.Read(s.Path)
		if err != nil {
			return nil, fmt.Errorf("failed to read secret from path %s: %w", s.Path, err)
		}
		if vaultSecret == nil || vaultSecret.Data == nil {
			return nil, fmt.Errorf("no secret found at path %s", s.Path)
		}
		
		// For KVv2, the data is nested under a "data" key.
		data, ok := vaultSecret.Data["data"].(map[string]interface{})
		if !ok {
			return nil, fmt.Errorf("unexpected secret format at path %s, expected KVv2 format", s.Path)
		}

		value, ok := data[s.Key].(string)
		if !ok {
			return nil, fmt.Errorf("key '%s' not found or not a string in secret at path %s", s.Key, s.Path)
		}

		secretData[s.TargetKey] = []byte(value)
	}

	return secretData, nil
}

This code is pragmatic. It hardcodes the service account token path because that path is a Kubernetes standard. It assumes the KVv2 engine and says so explicitly in a comment, since the nested "data" wrapper is a common source of confusion when left unstated. The error handling is also specific, clearly indicating whether the failure was in authentication, the path, or the key name.
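
One detail worth spelling out: for KVv2, the API read path used by client.Logical().Read() contains a data/ segment that the vault kv CLI hides. Assuming the engine is mounted at kv/, the two views of the same secret look like this (values are illustrative):

# Writing with the CLI (KVv2 engine mounted at "kv/")
$ vault kv put kv/billing/database url="postgres://user:pass@db-host:5432/billing"

# Reading via the API path, as referenced in the VaultApp spec:
#   kv/data/billing/database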

Syncing the Kubernetes Secret and Deployment

Once we have the secrets, we need to write them to a Kubernetes Secret and then ensure the application’s Deployment mounts it. The controller-runtime library makes this process elegant.

// internal/controller/k8s_resources.go

func (r *VaultAppReconciler) reconcileK8sSecret(ctx context.Context, app *v1alpha1.VaultApp, name string, data map[string][]byte) error {
	secret := &corev1.Secret{
		ObjectMeta: metav1.ObjectMeta{
			Name:      name,
			Namespace: app.Namespace,
		},
	}

	// Use CreateOrUpdate to ensure the secret is in the desired state.
	// It will create the secret if it doesn't exist, or update it if it does.
	op, err := controllerutil.CreateOrUpdate(ctx, r.Client, secret, func() error {
		// Set owner reference so the secret gets garbage collected when the VaultApp is deleted.
		if err := controllerutil.SetControllerReference(app, secret, r.Scheme); err != nil {
			return err
		}
		secret.Type = corev1.SecretTypeOpaque
		secret.Data = data
		return nil
	})

	if err != nil {
		return fmt.Errorf("failed to CreateOrUpdate secret %s: %w", name, err)
	}

	log := log.FromContext(ctx)
	if op != controllerutil.OperationResultNone {
		log.Info("Kubernetes secret reconciled", "operation", op)
	}
	return nil
}

func (r *VaultAppReconciler) reconcileDeployment(ctx context.Context, app *v1alpha1.VaultApp, secretName string, secretData map[string][]byte) error {
	dep := &appsv1.Deployment{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name,
			Namespace: app.Namespace,
		},
	}
	
	// Again, use CreateOrUpdate for idempotency.
	op, err := controllerutil.CreateOrUpdate(ctx, r.Client, dep, func() error {
		// Start with the user-provided spec.
		desiredSpec := *app.Spec.DeploymentSpec.DeepCopy()

		// *** CRITICAL MODIFICATION ***
		// We must inject the volume and volumeMount for our managed secret.
		// This is the core value proposition of the operator.
		volumeName := "vault-secrets"
		desiredSpec.Template.Spec.Volumes = append(desiredSpec.Template.Spec.Volumes, corev1.Volume{
			Name: volumeName,
			VolumeSource: corev1.VolumeSource{
				Secret: &corev1.SecretVolumeSource{
					SecretName: secretName,
				},
			},
		})

		// Inject into ALL containers defined in the spec.
		for i := range desiredSpec.Template.Spec.Containers {
			desiredSpec.Template.Spec.Containers[i].VolumeMounts = append(
				desiredSpec.Template.Spec.Containers[i].VolumeMounts,
				corev1.VolumeMount{
					Name:      volumeName,
					ReadOnly:  true,
					MountPath: "/etc/secrets/vault",
				},
			)
		}
        
		// This is a simple strategy for secret rotation: when the secret content changes,
		// the pod-template annotation below changes too, which triggers a rolling update.
		secretHash := calculateMapHash(secretData) // hash of the data fetched from Vault
		if desiredSpec.Template.Annotations == nil {
			desiredSpec.Template.Annotations = make(map[string]string)
		}
		desiredSpec.Template.Annotations["vaultapp.techweaver.io/secret-version"] = secretHash

		// Apply the modified spec
		dep.Spec = desiredSpec
		return controllerutil.SetControllerReference(app, dep, r.Scheme)
	})

	if err != nil {
		return fmt.Errorf("failed to CreateOrUpdate deployment: %w", err)
	}

	log := log.FromContext(ctx)
	if op != controllerutil.OperationResultNone {
		log.Info("Deployment reconciled", "operation", op)
	}
	return nil
}

The reconcileDeployment function is where the magic happens. It doesn’t just apply the user’s DeploymentSpec; it injects its own modifications: it forcibly adds the Volume and VolumeMount, ensuring that our secrets are always mounted at the designated path, regardless of the template the user provides.

Furthermore, by adding a hash of the secret data as an annotation on the pod template, we implement a simple but effective trigger for secret rotation. When a secret in Vault is updated, our operator will update the Kubernetes Secret. In the next reconciliation, it will compute a new hash and update the Deployment’s template.metadata.annotations. Kubernetes detects this change to the Pod template and automatically performs a rolling update, launching new pods that mount the new secret content.
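
The calculateMapHash helper was left as pseudo-code above. Any stable content hash works; a minimal sketch that sorts keys for determinism could be:

// internal/controller/hash.go

package controller

import (
	"crypto/sha256"
	"encoding/hex"
	"sort"
)

// calculateMapHash returns a deterministic hash of the secret data so that
// identical content always produces the same pod-template annotation value.
func calculateMapHash(data map[string][]byte) string {
	keys := make([]string, 0, len(data))
	for k := range data {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	h := sha256.New()
	for _, k := range keys {
		h.Write([]byte(k))
		h.Write([]byte{0}) // separator to avoid key/value ambiguity
		h.Write(data[k])
		h.Write([]byte{0})
	}
	return hex.EncodeToString(h.Sum(nil))
}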

Automating Prometheus Monitoring

The final step is to dynamically create the ServiceMonitor resource.

// internal/controller/monitoring.go

import (
	monitoringv1 "github.com/prometheus-operator/prometheus-operator/pkg/apis/monitoring/v1"
)

func (r *VaultAppReconciler) reconcileServiceMonitor(ctx context.Context, app *v1alpha1.VaultApp) error {
	log := log.FromContext(ctx)
	
	// If monitoring is not enabled in the spec, we should ensure the ServiceMonitor does not exist.
	if app.Spec.Monitoring == nil || !app.Spec.Monitoring.Enabled {
		sm := &monitoringv1.ServiceMonitor{
			ObjectMeta: metav1.ObjectMeta{Name: app.Name, Namespace: app.Namespace},
		}
		err := r.Delete(ctx, sm)
		if err != nil && !apierrors.IsNotFound(err) {
			return fmt.Errorf("failed to delete ServiceMonitor: %w", err)
		}
		if err == nil {
			log.Info("Deleted obsolete ServiceMonitor")
		}
		return nil
	}

	sm := &monitoringv1.ServiceMonitor{
		ObjectMeta: metav1.ObjectMeta{
			Name:      app.Name,
			Namespace: app.Namespace,
		},
	}

	op, err := controllerutil.CreateOrUpdate(ctx, r.Client, sm, func() error {
		// The selector must match the labels on the service that exposes the application pods.
		// A robust operator would also manage the Service resource or have a way to discover its labels.
		// For simplicity, we assume the service labels match the deployment's pod labels.
		selector := app.Spec.DeploymentSpec.Selector

		sm.Spec = monitoringv1.ServiceMonitorSpec{
			Selector: *selector,
			Endpoints: []monitoringv1.Endpoint{
				{
					Port: app.Spec.Monitoring.Port,
					Path: app.Spec.Monitoring.Path,
					Interval: "30s", // Could be configurable in the CRD
				},
			},
		}
		return controllerutil.SetControllerReference(app, sm, r.Scheme)
	})

	if err != nil {
		return fmt.Errorf("failed to CreateOrUpdate ServiceMonitor: %w", err)
	}

	if op != controllerutil.OperationResultNone {
		log.Info("ServiceMonitor reconciled", "operation", op)
	}
	return nil
}

This function is also idempotent. If monitoring.enabled is false, it ensures the ServiceMonitor is deleted. If true, it creates or updates the ServiceMonitor to match the VaultApp’s Spec.

The Final Result: A Unified Application Definition

With this implementation, our initial fragmented and messy deployment definition can now be replaced by a single, cohesive VaultApp resource:

apiVersion: app.techweaver.io/v1alpha1
kind: VaultApp
metadata:
  name: billing-service
  namespace: default
spec:
  # 1. Deployment definition is embedded
  deploymentSpec:
    replicas: 3
    selector:
      matchLabels:
        app: billing-service
    template:
      metadata:
        labels:
          app: billing-service
      spec:
        containers:
        - name: billing-service
          image: my-registry/billing-service:1.3.0
          ports:
          - name: http-metrics
            containerPort: 8081
          # No secrets in ENV, they will be mounted from the operator-managed volume
          # The application now reads secrets from /etc/secrets/vault/db_url
          # and /etc/secrets/vault/provider_x_api_key

  # 2. Vault integration is declaratively defined
  vault:
    role: billing-app # Vault role for this app's identity
    secrets:
      - path: "kv/data/billing/database"
        key: "url"
        targetKey: "db_url" # Filename in the final K8s Secret
      - path: "kv/data/billing/provider-x"
        key: "key"
        targetKey: "provider_x_api_key"

  # 3. Monitoring is part of the same resource
  monitoring:
    enabled: true
    port: http-metrics
    path: /actuator/prometheus

Now, developers only need to focus on this one file. They declare what their application needs, not how to get it. Secret rotation, deployment updates, and monitoring configuration are all handled automatically by the VaultApp Operator behind the scenes. This is the core value of platform engineering: reducing cognitive load and improving development and operational efficiency by building higher-level abstractions.
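
Applying this single manifest and inspecting what the operator creates is then a routine kubectl exercise (the file name is arbitrary; the object names follow the conventions used in the controller code above):

$ kubectl apply -f billing-service-vaultapp.yaml

# Objects materialized by the operator:
$ kubectl get vaultapp billing-service
$ kubectl get deployment billing-service
$ kubectl get secret billing-service-vault-secrets
$ kubectl get servicemonitor billing-service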

Limitations and Future Iterations

This operator implements a powerful pattern, but it’s not without its limitations. In a real-world project, it’s just a starting point.

  1. Graceful Secret Rotation: While the current rolling update strategy is effective, it can be too disruptive for some applications. A more advanced solution would be to integrate the Vault Agent Sidecar pattern. The operator would be responsible for injecting the sidecar, and the application would read secrets from the sidecar via the local filesystem or an HTTP API, enabling hot-reloading without restarts.
  2. Error Handling and Observability: A production-grade operator needs more granular status reporting. The Status field should contain richer conditions that detail the state of each step in the reconciliation, such as VaultAuthSuccess, SecretsSynced, DeploymentReady, etc. (a starting point for the condition-setting helper is sketched after this list). The operator itself should also expose Prometheus metrics to monitor reconciliation latency, error rates, and more.
  3. Support for More Resource Types: Currently, we only manage Deployments, but real applications may also need Services, Ingresses, or StatefulSets. The operator could be extended into a complete application lifecycle manager.
  4. Testing Strategy: Testing an operator is crucial. Beyond unit tests, using an integration testing framework like envtest to test the reconciliation logic against an in-memory control plane is key to ensuring its stability.
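
On point 2: the updateStatusCondition helper called throughout Reconcile was omitted earlier for brevity. A minimal sketch using the standard apimachinery meta helpers, against the VaultAppStatus fields defined in Step 1, could look like this:

// internal/controller/status.go

package controller

import (
	"context"

	"k8s.io/apimachinery/pkg/api/meta"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"

	"github.com/your-org/vaultapp-operator/api/v1alpha1"
)

// updateStatusCondition records a condition on the VaultApp status subresource
// and persists it via the status client.
func (r *VaultAppReconciler) updateStatusCondition(ctx context.Context, app *v1alpha1.VaultApp,
	condType string, status metav1.ConditionStatus, reason, message string) error {
	meta.SetStatusCondition(&app.Status.Conditions, metav1.Condition{
		Type:    condType,
		Status:  status,
		Reason:  reason,
		Message: message,
	})
	return r.Status().Update(ctx, app)
}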

The boundaries of this pattern’s applicability are also clear. It’s best suited for teams that want to provide a standardized, automated platform for their internal developers. For small projects or one-off deployments, the upfront investment might be too high. However, as an organization scales, this approach of encapsulating operational complexity within a custom Kubernetes API will yield massive returns.

