RudderVirt

#Troubleshooting

#Build stuck in Networking

The build could not create its network resources. Check that:

  • Your CIDRs don't overlap with existing reserved ranges (the egress subnet is typically 172.17.0.0/16).
  • VPC and subnet names are valid DNS labels (lowercase letters, digits, hyphens).
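
Both checks can be run locally before submitting a build. The following is a minimal Python sketch, not part of RudderVirt: the function names are hypothetical, the reserved range is just the typical egress subnet mentioned above, and the 63-character limit and the rule that a label must start and end with a letter or digit are the usual DNS label rules, assumed here to match what RudderVirt enforces.

```python
import ipaddress
import re

# Assumption: the only reserved range is the typical egress subnet.
# Add any other ranges reserved in your environment.
RESERVED_RANGES = ("172.17.0.0/16",)

# Lowercase DNS label: 1-63 characters from [a-z0-9-],
# starting and ending with a letter or digit.
DNS_LABEL = re.compile(r"[a-z0-9]([a-z0-9-]{0,61}[a-z0-9])?")


def overlapping_ranges(cidr, reserved=RESERVED_RANGES):
    """Return the reserved ranges that overlap the given CIDR."""
    network = ipaddress.ip_network(cidr, strict=False)
    return [r for r in reserved if network.overlaps(ipaddress.ip_network(r))]


def is_dns_label(name):
    """Return True if name is a valid lowercase DNS label."""
    return DNS_LABEL.fullmatch(name) is not None
```

For example, overlapping_ranges("172.17.4.0/24") returns ["172.17.0.0/16"], while is_dns_label("Build_VPC") returns False because of the uppercase letters and the underscore.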

#Build stuck in Booting, never reaches Provisioning

The VM booted but SSH never came up. Common causes:

  • Cloud image without a cloudInit block: cloud-init never runs, so no SSH key is injected.
  • Wrong sshUsername for the distro (ubuntu vs. debian vs. cloud-user).
  • For ISO installs: the boot command didn't drive the installer correctly. Watch the VNC console through the UI to see what's actually on screen.
  • For Windows: Autounattend.xml didn't run. Open VNC during the build to see if Setup is asking interactive questions.
  • sshTimeout is shorter than the install actually takes. Increase it.
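
To choose a realistic sshTimeout, it can help to measure how long a VM takes to open its SSH port. The sketch below is a hypothetical helper, not a RudderVirt tool, and it assumes the VM's address is reachable from the machine running the script. It only confirms that the TCP port accepts connections; a successful login still depends on the right sshUsername and key.

```python
import socket
import time


def seconds_until_port_opens(host, port=22, timeout=600.0, interval=2.0):
    """Poll host:port until a TCP connection succeeds.

    Returns the elapsed seconds, or None if the port did not open
    within `timeout` seconds.
    """
    start = time.monotonic()
    while time.monotonic() - start < timeout:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            # Refused, unreachable, or timed out: wait and retry.
            time.sleep(interval)
    return None
```

Start the script when the build enters Booting, then set sshTimeout comfortably above the measured value.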

#Provisioner fails with permission errors

Linux: the build assumes the SSH user has passwordless sudo, but many cloud images don't grant it and distro defaults vary. If sudo prompts for a password, feed it through executeCommand:

shell:
    executeCommand: "echo '{{password}}' | sudo -S sh -c '{{ .Command }}'"
    inline: |
        apt-get install -y nginx

Alternatively, set up passwordless sudo in your preseed or cloud-init configuration.

Windows: PowerShell execution policy can block scripts. Set it explicitly:

Set-ExecutionPolicy -ExecutionPolicy Unrestricted -Force

#windows-update step times out or loops

  • Some updates require a reboot before others become eligible. The provisioner handles this with reboots between cycles, but very large update queues can take many cycles.
  • Bump timeout to several hours.
  • If specific updates are failing, use filters to exclude them.

#Boot command keystrokes happen too early or too late

ISO installers are timing-sensitive. Insert <wait> or <waitN> tokens in the boot command at the points where the installer needs time to catch up. When in doubt, watch the build via VNC and time the boot loader yourself.
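
When tuning waits, it is useful to know how much delay a boot command adds in total. The Python sketch below is a hypothetical helper that sums the pauses, assuming <wait> pauses for 1 second and <waitN> pauses for N seconds, as in Packer-style boot commands. Confirm the exact semantics against the RudderVirt boot command reference before relying on the numbers.

```python
import re

# Assumption: "<wait>" pauses 1 second and "<waitN>" pauses N seconds.
WAIT_TOKEN = re.compile(r"<wait(\d*)>")


def total_wait_seconds(boot_command):
    """Sum the pauses encoded by <wait> and <waitN> tokens."""
    # findall returns the captured digits, or "" for a bare <wait>.
    return sum(int(n) if n else 1 for n in WAIT_TOKEN.findall(boot_command))
```

Under those assumptions, a command containing one <wait> and one <wait10> adds 11 seconds of delay. Other tokens in the command, such as key presses, are ignored.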

#Build runs but a cloned VM won't boot

  • Is the disk too small for what you installed? Each template VM's disk inherits disks[].size.
  • Did you forget to install virtio drivers (Windows)? Without netkvm/vioscsi, clones fail to boot or fail to network.
  • Did you change NIC model between build and clone without installing the matching driver?
  • Did you cleanly shut down? End each VM's provisioners with a reboot so the captured disk isn't in a half-flushed state.

#Inspecting a stuck build

  • Open the build in the UI to see its current phase, conditions, and any error message.
  • Check the build's recent events and log output.
  • Open the affected VM's VNC console through the UI to see what's on screen during a hang.
  • Insert a handbuild provisioner at the suspect step to pause the build and inspect interactively.