Hyper-V Windows Server Automation - Project Documentation

Project Overview

This project demonstrates enterprise-grade automation for the complete lifecycle of Windows Server VMs on Hyper-V using Ansible Automation Platform (AAP), implementing GitOps and Infrastructure as Code (IaC) principles.

Target Audience: Enterprise IT operations teams, infrastructure engineers, platform engineers

Deployment Platform: Ansible Automation Platform 2.4+ (automation controller, formerly Ansible Tower)

Future Roadmap: Event-Driven Ansible (EDA) integration for reactive automation

Architecture

Technology Stack

  • Automation Engine: Ansible Core 2.15+
  • Platform: Ansible Automation Platform 2.4+
  • Hypervisor: Microsoft Hyper-V (Windows Server 2019/2022) - hyperv1.lan.toal.ca
  • Guest OS: Windows Server 2019/2022
  • CMDB: ServiceNow ITSM
  • Version Control: Git (GitOps workflow)
  • Authentication: Active Directory / Kerberos
  • Inventory: ToalLab standard inventory (/home/ptoal/Dev/inventories/toallab-inventory)

Connectivity Model

Ansible Automation Platform
    ↓ (WinRM over HTTPS/Kerberos)
Windows Hyper-V Host(s)
    ↓ (Hyper-V PowerShell)
Windows Server VMs
    ↓ (REST API)
ServiceNow CMDB

Core Use Cases

  1. VM Provisioning: Automated creation of Windows Server VMs using unattended installation (autounattend.xml)
  2. Patch Management: Automated Windows Update deployment triggered by git commits
  3. Application Deployment: Install and configure applications (IIS demonstration)
  4. Configuration Management: Day-2 operations and drift remediation
  5. CMDB Synchronization: Bidirectional sync with ServiceNow CMDB

Project Structure

.
├── ansible.cfg                      # Ansible configuration with Windows/WinRM defaults
├── collections/
│   └── requirements.yml             # Required Ansible collections
├── inventory/                       # DEPRECATED: Using toallab-inventory
│   ├── production/                  # Legacy - for reference only
│   │   └── hosts.yml                
│   └── staging/
│       └── hosts.yml                # Staging inventory (future)
├── group_vars/                      # DEPRECATED: Moved to toallab-inventory
│   ├── all.yml                      # Legacy - for reference only
│   ├── hyperv_hosts.yml             
│   ├── windows_servers.yml          
│   └── web_servers.yml              
├── host_vars/                       # DEPRECATED: Moved to toallab-inventory
├── playbooks/
│   ├── provision-vm.yml             # VM provisioning workflow
│   ├── patch-vms.yml                # Windows Update automation
│   ├── install-iis.yml              # IIS deployment
│   └── sync-cmdb.yml                # ServiceNow CMDB sync
├── roles/                           # Custom roles (future development)
│   ├── windows_baseline/            # Windows hardening & baseline config
│   ├── hyperv_vm/                   # Hyper-V VM management
│   ├── iis_webapp/                  # IIS application deployment
│   └── servicenow_sync/             # ServiceNow integration
├── templates/                       # Jinja2 templates (future)
│   └── autounattend.xml.j2          # Windows unattended install template
└── README.md                        # Quick start guide
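
As a rough illustration, collections/requirements.yml for this stack might look like the sketch below. The collection list is an assumption based on the technology stack (Windows hosts, ServiceNow CMDB), not the project's actual file; pin versions to whatever the execution environment supports.

```yaml
---
# collections/requirements.yml (illustrative; the actual collection list is an assumption)
collections:
  - name: ansible.windows     # core Windows modules (win_ping, win_updates, win_reboot)
  - name: community.windows   # additional community-maintained Windows modules
  - name: servicenow.itsm     # ServiceNow ITSM / CMDB modules
```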

Key Design Patterns

GitOps Workflow

All infrastructure changes flow through Git:

  1. Engineer creates feature branch
  2. Updates inventory or group_vars to define desired state
  3. Commits and creates pull request
  4. AAP webhook triggers job template for validation
  5. After merge, AAP webhook triggers deployment

Idempotency

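All playbooks must be idempotent: running them repeatedly must converge on the same state and report no changes once that state is reached. Use:

  • Declarative state parameters (state: present / state: absent)
  • Conditional tasks with when:
  • changed_when: / failed_when: conditions, and handlers that run only on change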
  • Check mode support (--check)
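
A minimal sketch of an idempotent task, assuming VM creation is driven through ansible.windows.win_powershell on the Hyper-V host. The vm_name value, generation, and memory size are illustrative assumptions, not project settings.

```yaml
---
- name: Ensure a VM exists (idempotency sketch)
  hosts: hyperv
  gather_facts: false
  vars:
    vm_name: DEMO-WEB01   # hypothetical value, normally supplied by inventory or survey
  tasks:
    - name: Create the VM only if it is missing
      ansible.windows.win_powershell:
        parameters:
          Name: "{{ vm_name }}"
        script: |
          [CmdletBinding()]
          param ([String]$Name)

          if (Get-VM -Name $Name -ErrorAction SilentlyContinue) {
              # VM already exists: report "ok" instead of "changed"
              $Ansible.Changed = $false
          }
          else {
              # Sizing values below are placeholders
              New-VM -Name $Name -Generation 2 -MemoryStartupBytes 4GB | Out-Null
          }
```

win_powershell reports changed by default, so the script has to set $Ansible.Changed to false itself when nothing was done. A second run against the same host should then report ok rather than changed.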

Credential Management

  • Never commit secrets to Git
  • Use AAP credential types for:
    • Machine credentials (WinRM)
    • ServiceNow credentials
    • Domain join credentials
  • Use Ansible Vault for sensitive variables in development

Role-Based Organization

Future development should extract common patterns into roles:

  • roles/windows_baseline/: Base Windows configuration
  • roles/hyperv_vm/: VM lifecycle management
  • roles/iis_webapp/: IIS deployment patterns

Technical Requirements

Prerequisites

  1. Ansible Automation Platform

    • AAP 2.4 or later
    • Controller configured with Windows machine credentials
    • Execution environment with Windows collections
  2. Hyper-V Environment

    • Windows Server 2019/2022 with Hyper-V role
    • WinRM enabled and configured
    • Kerberos authentication configured
    • Sufficient storage for VM images
  3. Network Requirements

    • WinRM ports (5985 for HTTP, 5986 for HTTPS) open from AAP to Hyper-V hosts
    • WinRM ports open from AAP to managed Windows VMs
    • DNS resolution for all hosts
    • Active Directory domain membership
  4. ServiceNow

    • ServiceNow instance with CMDB
    • API user credentials
    • CMDB table structure defined

Windows Remote Management Setup

On all Windows hosts (Hyper-V and VMs):

# Enable WinRM with HTTPS
winrm quickconfig -transport:https
winrm set winrm/config/service/auth '@{Kerberos="true"}'
winrm set winrm/config/service '@{AllowUnencrypted="false"}'
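
On the Ansible side, the matching connection settings are usually expressed as inventory variables. The sketch below assumes they live in a group_vars file for the Windows groups; the file location is an assumption, while the variable names are the standard WinRM connection variables.

```yaml
---
# group_vars for Windows hosts (illustrative placement)
ansible_connection: winrm
ansible_port: 5986                              # WinRM over HTTPS
ansible_winrm_scheme: https
ansible_winrm_transport: kerberos               # matches the Kerberos auth enabled on the host
ansible_winrm_server_cert_validation: validate  # requires a certificate trusted by the controller
```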

Development Guidelines

Adding New Playbooks

  1. Create playbook in playbooks/ directory
  2. Use descriptive names: verb-noun.yml (e.g., deploy-webapp.yml)
  3. Include proper documentation in header
  4. Add tags for selective execution
  5. Implement check mode support
  6. Test in staging environment first

Inventory Configuration

This project uses the ToalLab standard inventory located at /home/ptoal/Dev/inventories/toallab-inventory.

Inventory Groups:

  • hyperv - Hyper-V hosts (hyperv1.lan.toal.ca)
  • windows_servers - All Windows Server VMs
    • web_servers - IIS/web servers
    • app_servers - Application servers
    • db_servers - Database servers

Group Variables:

  • /home/ptoal/Dev/inventories/toallab-inventory/group_vars/hyperv/vars.yml - Hyper-V defaults
  • /home/ptoal/Dev/inventories/toallab-inventory/group_vars/windows_servers/vars.yml - Windows defaults

Host Variables:

  • /home/ptoal/Dev/inventories/toallab-inventory/host_vars/hyperv1.lan.toal.ca/vars.yml - Hypervisor config

Variable Precedence

Variables are resolved in this order, from lowest to highest precedence:

  1. group_vars/all.yml - Global defaults (in toallab-inventory)
  2. group_vars/<group>.yml - Group-specific (in toallab-inventory)
  3. host_vars/<host>.yml - Host-specific (in toallab-inventory)
  4. Playbook vars: - Playbook overrides
  5. Extra vars (-e) - Runtime overrides
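
To make the ordering concrete, here is a small sketch in which the same variable is defined at three inventory levels. The variable name vm_memory_gb and all values are hypothetical.

```yaml
# group_vars/all.yml: global default (lowest precedence shown here)
vm_memory_gb: 4
---
# group_vars/web_servers.yml: overrides the global default for web servers
vm_memory_gb: 8
---
# host_vars/DEMO-WEB01.yml: overrides the group value for a single host
vm_memory_gb: 16
```

With these files, DEMO-WEB01 resolves vm_memory_gb to 16, other web servers to 8, and everything else to 4. Passing -e vm_memory_gb=32 at runtime would override all three.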

Testing Strategy

  1. Syntax Check: ansible-playbook --syntax-check playbook.yml
  2. Check Mode: ansible-playbook --check playbook.yml
  3. Limit Scope: --limit to test on single host first
  4. Verbose Output: Use -v, -vv, -vvv for debugging
  5. Staging First: Always test in staging before production

Windows Module Best Practices

  • Use fully qualified ansible.windows.* module names (not the legacy short win_* names)
  • Always handle reboots explicitly with ansible.windows.win_reboot
  • Use register: to capture task output
  • Check reboot_required in the registered result before rebooting
  • Use failed_when: for expected error conditions
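
The practices above can be sketched as a short patching play. The update categories and reboot timeout are illustrative choices, not values taken from playbooks/patch-vms.yml.

```yaml
---
- name: Patch Windows servers (sketch)
  hosts: windows_servers
  tasks:
    - name: Install security and critical updates
      ansible.windows.win_updates:
        category_names:
          - SecurityUpdates
          - CriticalUpdates
        state: installed
      register: update_result

    - name: Reboot only when the installed updates require it
      ansible.windows.win_reboot:
        reboot_timeout: 1800   # seconds; illustrative value
      when: update_result.reboot_required
```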

AAP Integration

Job Templates to Create

  1. Provision VM - playbooks/provision-vm.yml

    • Survey for VM name, IP, CPU, RAM
    • Credential: Hyper-V machine credential
    • Webhook enabled for GitOps
  2. Patch VMs - playbooks/patch-vms.yml

    • Limit pattern for selective patching
    • Scheduled for maintenance windows
    • Credential: Windows machine credential
  3. Deploy IIS - playbooks/install-iis.yml

    • Limit to web_servers group
    • Credential: Windows machine credential
  4. Sync CMDB - playbooks/sync-cmdb.yml

    • Scheduled daily
    • Credentials: Windows + ServiceNow
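
Job templates can themselves be managed as code. The following is a sketch using the ansible.controller collection, assuming controller connection details are supplied through the environment (for example CONTROLLER_HOST). The organization, project, inventory, and credential names are placeholders.

```yaml
---
- name: Define the Patch VMs job template (sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Create or update the job template
      ansible.controller.job_template:
        name: Patch VMs
        organization: Default                 # placeholder
        project: hyperv-demo                  # placeholder project name
        inventory: toallab-inventory          # placeholder inventory name
        playbook: playbooks/patch-vms.yml
        credentials:
          - Windows machine credential        # placeholder credential name
        ask_limit_on_launch: true             # allows selective patching via a limit pattern
        state: present
```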

Workflow Templates (Future)

Create workflows for complex orchestration:

  • Full VM Lifecycle: Provision → Configure → Deploy App → Update CMDB
  • Patch & Compliance: Patch → Verify → Update CMDB → Generate Report

Event-Driven Ansible (Future)

Planned EDA integrations:

  • ServiceNow incident triggers remediation playbook
  • Windows Event Log monitoring triggers security response
  • Hyper-V alerts trigger capacity management
  • Git webhook triggers deployment pipeline
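
As a sketch of the Git webhook case, an EDA rulebook could look roughly like this. The listening port, the payload field (a push event carrying ref), and the job template and organization names are all assumptions.

```yaml
---
- name: Deploy on Git push (sketch)
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000                     # illustrative port
  rules:
    - name: Run deployment when main is updated
      # Assumes the Git server posts a push payload containing "ref"
      condition: event.payload.ref == "refs/heads/main"
      action:
        run_job_template:
          name: Provision VM           # placeholder job template name
          organization: Default        # placeholder organization
```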

Common Tasks

Bootstrap a New VM

# Provision VM
ansible-playbook playbooks/provision-vm.yml \
  -e vm_name=DEMO-WEB01 \
  -e vm_ip_address=192.168.1.101

# Configure baseline (windows-baseline.yml is not yet listed under Project Structure)
ansible-playbook playbooks/windows-baseline.yml --limit DEMO-WEB01

# Deploy application
ansible-playbook playbooks/install-iis.yml --limit DEMO-WEB01

# Update CMDB
ansible-playbook playbooks/sync-cmdb.yml --limit DEMO-WEB01

Patch All Windows Servers

ansible-playbook playbooks/patch-vms.yml --limit windows_servers

Update Specific Group

ansible-playbook playbooks/patch-vms.yml --limit web_servers

Troubleshooting

WinRM Connection Issues

# Test WinRM connectivity (hyperv is the Hyper-V host group in toallab-inventory)
ansible hyperv -m ansible.windows.win_ping

# Check with verbose output
ansible hyperv -m ansible.windows.win_ping -vvv

Common Issues

  1. Kerberos Authentication Failure

    • Verify DNS resolution (forward and reverse)
    • Check domain join status
    • Verify time synchronization
    • Check Kerberos ticket: klist
  2. Module Not Found

    • Install collections: ansible-galaxy collection install -r collections/requirements.yml
    • Verify in AAP execution environment
  3. Timeout Issues

    • Increase timeout in ansible.cfg
    • Check network connectivity
    • Verify WinRM service running
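
Besides ansible.cfg, WinRM timeouts can also be raised per group through connection variables. A sketch, with illustrative values and an assumed file location:

```yaml
---
# group_vars/windows_servers/vars.yml (illustrative values)
ansible_winrm_operation_timeout_sec: 60
ansible_winrm_read_timeout_sec: 90   # must be larger than the operation timeout
```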

Security Considerations

Credential Storage

  • Store production secrets in AAP credentials; reserve Ansible Vault for development
  • Rotate credentials regularly
  • Use least-privilege service accounts
  • Separate credentials per environment

Network Security

  • Use WinRM over HTTPS (port 5986)
  • Enable Kerberos encryption
  • Implement network segmentation
  • Use jump hosts/bastion for AAP

Compliance

  • Enable audit logging in AAP
  • Log all playbook runs
  • Track changes in ServiceNow CMDB
  • Implement change approval workflow

Future Enhancements

Phase 2 - Advanced Features

  • Custom execution environment with all dependencies
  • Ansible Vault integration for secrets
  • Enhanced autounattend.xml templating
  • VM template/image management
  • Backup and DR automation

Phase 3 - EDA Integration

  • ServiceNow incident-driven remediation
  • Windows Event Log monitoring
  • Hyper-V performance monitoring
  • Self-healing automation

Phase 4 - Enterprise Scale

  • Multi-region Hyper-V clusters
  • RBAC and delegation model
  • Compliance scanning and remediation
  • Cost tracking and optimization
  • Disaster recovery automation

Contributing

This is a demonstration project. When extending:

  1. Follow existing patterns and structure
  2. Test thoroughly in staging
  3. Document all variables in group_vars
  4. Use semantic versioning for releases
  5. Update this CLAUDE.md with architectural changes

References