# Hyper-V Windows Server Automation - Project Documentation

## Project Overview

This project demonstrates enterprise-grade automation for the complete lifecycle of Windows Server VMs on Hyper-V using Ansible Automation Platform (AAP), implementing GitOps and Infrastructure as Code (IaC) principles.

**Target Audience**: Enterprise IT operations teams, infrastructure engineers, platform engineers

**Deployment Platform**: Ansible Automation Platform 2.x (formerly Ansible Tower)

**Future Roadmap**: Event-Driven Ansible (EDA) integration for reactive automation

## Architecture

### Technology Stack

- **Automation Engine**: Ansible Core 2.15+
- **Platform**: Ansible Automation Platform 2.4+
- **Hypervisor**: Microsoft Hyper-V (Windows Server 2019/2022)
- **Guest OS**: Windows Server 2019/2022
- **CMDB**: ServiceNow ITSM
- **Version Control**: Git (GitOps workflow)
- **Authentication**: Active Directory / Kerberos

### Connectivity Model

```
Ansible Automation Platform
    ↓ (WinRM over HTTPS/Kerberos)
Windows Hyper-V Host(s)
    ↓ (Hyper-V PowerShell)
Windows Server VMs
    ↓ (REST API)
ServiceNow CMDB
```

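The first hop in the connectivity model (AAP to Windows over WinRM, using Kerberos on HTTPS) is typically expressed as connection variables. The following is a minimal sketch for `group_vars/hyperv_hosts.yml`, assuming the hosts present a certificate that the execution environment trusts. The values are illustrative and not taken from this repository.

```yaml
# group_vars/hyperv_hosts.yml (illustrative values, not from this repository)
ansible_connection: winrm
ansible_port: 5986                        # WinRM HTTPS listener
ansible_winrm_scheme: https
ansible_winrm_transport: kerberos
ansible_winrm_server_cert_validation: validate
```
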
### Core Use Cases

1. **VM Provisioning**: Automated creation of Windows Server VMs using unattended installation (autounattend.xml)
2. **Patch Management**: Automated Windows Update deployment triggered by git commits
3. **Application Deployment**: Install and configure applications (IIS demonstration)
4. **Configuration Management**: Day-2 operations and drift remediation
5. **CMDB Synchronization**: Bidirectional sync with ServiceNow CMDB

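As an illustration of the VM provisioning use case, the sketch below creates a VM on a Hyper-V host with the `New-VM` cmdlet through `ansible.windows.win_powershell`. It is not the project's actual `provision-vm.yml`: the variable names, paths, sizes, and switch name are assumptions, and attaching installation media and the autounattend.xml answer file is left out.

```yaml
# Hypothetical excerpt of playbooks/provision-vm.yml; all variable names and values are assumptions.
- name: Provision a Windows Server VM on Hyper-V
  hosts: hyperv_hosts
  gather_facts: false
  vars:
    vm_name: DEMO-WEB01
    vm_memory_bytes: 4294967296          # 4 GiB
    vm_vhd_size_bytes: 64424509440       # 60 GiB
    vm_vhd_path: 'D:\VMs\DEMO-WEB01.vhdx'
    vm_switch: External

  tasks:
    - name: Create the VM only if it does not exist yet
      ansible.windows.win_powershell:
        script: |
          [CmdletBinding()]
          param (
              [String]$Name,
              [Int64]$MemoryBytes,
              [String]$VhdPath,
              [Int64]$VhdSizeBytes,
              [String]$SwitchName
          )

          if (Get-VM -Name $Name -ErrorAction SilentlyContinue) {
              # VM already present: report no change so reruns stay idempotent
              $Ansible.Changed = $false
              return
          }

          New-VM -Name $Name -Generation 2 -MemoryStartupBytes $MemoryBytes `
              -NewVHDPath $VhdPath -NewVHDSizeBytes $VhdSizeBytes `
              -SwitchName $SwitchName | Out-Null
        parameters:
          Name: "{{ vm_name }}"
          MemoryBytes: "{{ vm_memory_bytes }}"
          VhdPath: "{{ vm_vhd_path }}"
          VhdSizeBytes: "{{ vm_vhd_size_bytes }}"
          SwitchName: "{{ vm_switch }}"
```

The existence check is what keeps the task safe to rerun: `win_powershell` reports a change by default, so the script sets `$Ansible.Changed` to false when the VM is already there.
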
## Project Structure

```
.
├── ansible.cfg                 # Ansible configuration with Windows/WinRM defaults
├── collections/
│   └── requirements.yml        # Required Ansible collections
├── inventory/
│   ├── production/
│   │   └── hosts.yml           # Production inventory
│   └── staging/
│       └── hosts.yml           # Staging inventory (future)
├── group_vars/
│   ├── all.yml                 # Global variables
│   ├── hyperv_hosts.yml        # Hyper-V host configuration
│   ├── windows_servers.yml     # Windows Server defaults
│   └── web_servers.yml         # IIS/web server configuration
├── host_vars/                  # Host-specific variables (future)
├── playbooks/
│   ├── provision-vm.yml        # VM provisioning workflow
│   ├── patch-vms.yml           # Windows Update automation
│   ├── install-iis.yml         # IIS deployment
│   └── sync-cmdb.yml           # ServiceNow CMDB sync
├── roles/                      # Custom roles (future development)
│   ├── windows_baseline/       # Windows hardening & baseline config
│   ├── hyperv_vm/              # Hyper-V VM management
│   ├── iis_webapp/             # IIS application deployment
│   └── servicenow_sync/        # ServiceNow integration
├── templates/                  # Jinja2 templates (future)
│   └── autounattend.xml.j2     # Windows unattended install template
└── README.md                   # Quick start guide
```

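The contents of `collections/requirements.yml` are not shown in this document. A plausible minimal version, assuming the playbooks rely only on the Windows and ServiceNow collections referenced elsewhere here, might look like this (version pins omitted):

```yaml
# collections/requirements.yml (illustrative; the real file may differ)
collections:
  - name: ansible.windows      # win_ping, win_feature, win_updates, win_reboot, win_powershell
  - name: community.windows    # additional Windows modules
  - name: servicenow.itsm      # ServiceNow CMDB and ITSM modules
```
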
## Key Design Patterns

### GitOps Workflow

All infrastructure changes flow through Git:

1. Engineer creates feature branch
2. Updates inventory or group_vars to define desired state
3. Commits and creates pull request
4. AAP webhook triggers job template for validation
5. After merge, AAP webhook triggers deployment

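The webhook-triggered job templates in steps 4 and 5 can themselves be kept in Git. The sketch below uses the `ansible.controller.job_template` module (the upstream `awx.awx` collection provides the same module) and assumes controller access is supplied through the `CONTROLLER_HOST` and `CONTROLLER_OAUTH_TOKEN` environment variables. The project, inventory, organization, and credential names are hypothetical.

```yaml
# Hypothetical controller configuration play; all object names are assumptions.
- name: Define the deployment job template as code
  hosts: localhost
  connection: local
  gather_facts: false

  tasks:
    - name: Job template that a Git webhook can launch after merge
      ansible.controller.job_template:
        name: Provision VM
        organization: Default
        project: hyperv-automation
        inventory: Production
        playbook: playbooks/provision-vm.yml
        credentials:
          - Hyper-V machine credential
        webhook_service: github
        state: present
```
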
### Idempotency

All playbooks must be idempotent - safe to run multiple times without side effects. Use:

- `state: present` vs `state: absent`
- Conditional tasks with `when:`
- `changed_when:` / `failed_when:` conditions and handlers
- Check mode support (`--check`)

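A short sketch of these idempotency techniques applied to the IIS use case follows. It is an illustration rather than the project's `install-iis.yml`; the registered variable names are assumptions.

```yaml
- name: Idempotent IIS installation
  hosts: web_servers
  gather_facts: false

  tasks:
    - name: Ensure the IIS role is installed (no change on reruns)
      ansible.windows.win_feature:
        name: Web-Server
        state: present
        include_management_tools: true
      register: iis_feature

    - name: Reboot only when the feature installation requires it
      ansible.windows.win_reboot:
      when: iis_feature.reboot_required

    - name: Read the IIS service state without reporting a change
      ansible.windows.win_command: sc.exe query W3SVC
      register: w3svc_query
      changed_when: false
```

The declarative `state: present` task reports `changed` only on the first run, while the read-only command needs `changed_when: false` because command modules otherwise always report a change.
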
### Credential Management

- **Never commit secrets to Git**
- Use AAP credential types for:
  - Machine credentials (WinRM)
  - ServiceNow credentials
  - Domain join credentials
- Use Ansible Vault for sensitive variables in development

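For development outside AAP, one common pattern is to commit only an indirection to a vaulted variable. The sketch below assumes a hypothetical service account and a hypothetical `vault_winrm_password` variable defined in a separate file encrypted with `ansible-vault encrypt`.

```yaml
# group_vars/windows_servers.yml (development only; names are assumptions)
ansible_user: "svc-ansible@EXAMPLE.COM"
# The real value lives in a separate vault-encrypted vars file,
# so only this reference is committed in plain text.
ansible_password: "{{ vault_winrm_password }}"
```

When jobs run in AAP, the machine credential supplies the user and password at launch, so these variables would not need to exist in the repository at all.
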
### Role-Based Organization

Future development should extract common patterns into roles:

- `roles/windows_baseline/`: Base Windows configuration
- `roles/hyperv_vm/`: VM lifecycle management
- `roles/iis_webapp/`: IIS deployment patterns

## Technical Requirements

### Prerequisites

1. **Ansible Automation Platform**
   - AAP 2.4 or later
   - Controller configured with Windows machine credentials
   - Execution environment with Windows collections

2. **Hyper-V Environment**
   - Windows Server 2019/2022 with Hyper-V role
   - WinRM enabled and configured
   - Kerberos authentication configured
   - Sufficient storage for VM images

3. **Network Requirements**
   - WinRM ports (5985/5986) open from AAP to Hyper-V hosts
   - WinRM ports open from AAP to managed Windows VMs
   - DNS resolution for all hosts
   - Active Directory domain membership

4. **ServiceNow**
   - ServiceNow instance with CMDB
   - API user credentials
   - CMDB table structure defined

### Windows Remote Management Setup

On all Windows hosts (Hyper-V and VMs):

```powershell
# Enable WinRM with HTTPS
# (the HTTPS listener requires a server authentication certificate
# matching the host name to be installed first)
winrm quickconfig -transport:https
# Allow Kerberos authentication
winrm set winrm/config/service/auth '@{Kerberos="true"}'
# Reject unencrypted traffic
winrm set winrm/config/service '@{AllowUnencrypted="false"}'
```

## Development Guidelines

### Adding New Playbooks

1. Create playbook in `playbooks/` directory
2. Use descriptive names: `verb-noun.yml` (e.g., `deploy-webapp.yml`)
3. Include proper documentation in header
4. Add tags for selective execution
5. Implement check mode support
6. Test in staging environment first

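A skeleton that follows the conventions above (verb-noun name, documented header, tags, modules that support check mode) could look like the sketch below. The playbook, its variable, and the site path are hypothetical.

```yaml
---
# deploy-webapp.yml
# Purpose:   Deploy a demo web application to IIS hosts (hypothetical example)
# Targets:   web_servers
# Variables: webapp_site_path (hypothetical)
# Tags:      webapp, content

- name: Deploy web application
  hosts: web_servers
  gather_facts: false
  vars:
    webapp_site_path: 'C:\inetpub\demo'

  tasks:
    - name: Ensure the site directory exists
      ansible.windows.win_file:
        path: "{{ webapp_site_path }}"
        state: directory
      tags: [webapp]

    - name: Publish a placeholder landing page
      ansible.windows.win_copy:
        content: "<h1>Demo</h1>"
        dest: "{{ webapp_site_path }}\\index.html"
      tags: [webapp, content]
```
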
### Variable Precedence

Follow this hierarchy (least to most specific):

1. `group_vars/all.yml` - Global defaults
2. `group_vars/<group>.yml` - Group-specific
3. `host_vars/<host>.yml` - Host-specific
4. Playbook `vars:` - Playbook overrides
5. Extra vars (`-e`) - Runtime overrides

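To make the first three levels of the hierarchy concrete, the sketch below shows one hypothetical variable, `patch_reboot_timeout`, defined at each level. Each YAML document stands for a separate file.

```yaml
# group_vars/all.yml: global default (hypothetical variable)
patch_reboot_timeout: 600

---
# group_vars/web_servers.yml: group-specific value overrides the global default
patch_reboot_timeout: 1200

---
# host_vars/DEMO-WEB01.yml: host-specific value overrides the group value
patch_reboot_timeout: 1800
```

With these files, `DEMO-WEB01` resolves the variable to 1800, other members of `web_servers` to 1200, and every other host to 600. A playbook `vars:` entry or `-e patch_reboot_timeout=...` would override all three.
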
### Testing Strategy

1. **Syntax Check**: `ansible-playbook --syntax-check playbook.yml`
2. **Check Mode**: `ansible-playbook --check playbook.yml`
3. **Limit Scope**: `--limit` to test on single host first
4. **Verbose Output**: Use `-v`, `-vv`, `-vvv` for debugging
5. **Staging First**: Always test in staging before production

### Windows Module Best Practices

- Use fully qualified collection names (`ansible.windows.*`, `community.windows.*`) rather than the short `win_*` names
- Always handle reboots explicitly with `ansible.windows.win_reboot`
- Use `register:` to capture task output
- Check `reboot_required` in results
- Use `failed_when:` for expected error conditions

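The practices above combine naturally in a patching play. The following is a minimal sketch, not the project's `patch-vms.yml`; the update categories and the reboot timeout are assumptions.

```yaml
- name: Patch Windows servers
  hosts: windows_servers
  gather_facts: false

  tasks:
    - name: Install security and critical updates without rebooting
      ansible.windows.win_updates:
        category_names:
          - SecurityUpdates
          - CriticalUpdates
        state: installed
        reboot: false
      register: update_result

    - name: Reboot explicitly when the installed updates require it
      ansible.windows.win_reboot:
        reboot_timeout: 1800
      when: update_result.reboot_required
```

Keeping the reboot in its own task makes it visible in job output and lets a maintenance-window policy skip or delay it with tags or conditions.
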
## AAP Integration

### Job Templates to Create

1. **Provision VM** - `playbooks/provision-vm.yml`
   - Survey for VM name, IP, CPU, RAM
   - Credential: Hyper-V machine credential
   - Webhook enabled for GitOps

2. **Patch VMs** - `playbooks/patch-vms.yml`
   - Limit pattern for selective patching
   - Scheduled for maintenance windows
   - Credential: Windows machine credential

3. **Deploy IIS** - `playbooks/install-iis.yml`
   - Limit to web_servers group
   - Credential: Windows machine credential

4. **Sync CMDB** - `playbooks/sync-cmdb.yml`
   - Scheduled daily
   - Credentials: Windows + ServiceNow

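For the Sync CMDB template, the ServiceNow side could be handled by the `servicenow.itsm.configuration_item` module. The sketch below covers only the Ansible-to-ServiceNow direction and is not the project's actual playbook: the `sn_*` variables are hypothetical and assumed to be injected by a custom AAP credential type, and the CI class is an assumption. How the module matches an existing record (by name or by `sys_id`) may depend on the collection version, so verify that before relying on reruns.

```yaml
# Hypothetical excerpt of playbooks/sync-cmdb.yml; sn_* variable names are assumptions.
- name: Record Windows servers in the ServiceNow CMDB
  hosts: windows_servers
  gather_facts: false

  tasks:
    - name: Create or update the configuration item for this host
      servicenow.itsm.configuration_item:
        instance:
          host: "{{ sn_host }}"
          username: "{{ sn_username }}"
          password: "{{ sn_password }}"
        name: "{{ inventory_hostname }}"
        sys_class_name: cmdb_ci_win_server
        ip_address: "{{ ansible_host | default(omit) }}"
        state: present
      # The ServiceNow REST call runs from the execution environment, not the Windows host
      delegate_to: localhost
```
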
### Workflow Templates (Future)

Create workflows for complex orchestration:

- **Full VM Lifecycle**: Provision → Configure → Deploy App → Update CMDB
- **Patch & Compliance**: Patch → Verify → Update CMDB → Generate Report

### Event-Driven Ansible (Future)

Planned EDA integrations:

- ServiceNow incident triggers remediation playbook
- Windows Event Log monitoring triggers security response
- Hyper-V alerts trigger capacity management
- Git webhook triggers deployment pipeline

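As a rough idea of what the Git webhook integration might look like once EDA is adopted, the rulebook sketch below listens for webhook events and launches a job template on pushes to `main`. The port, the payload field (taken from the shape of a GitHub push event), and the template and organization names are assumptions.

```yaml
# Hypothetical rulebook; port, payload field, and template name are assumptions.
- name: Launch deployment on pushes to main
  hosts: all
  sources:
    - ansible.eda.webhook:
        host: 0.0.0.0
        port: 5000
  rules:
    - name: Run the provisioning job template after a merge to main
      condition: event.payload.ref == "refs/heads/main"
      action:
        run_job_template:
          name: Provision VM
          organization: Default
```
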
## Common Tasks

### Bootstrap a New VM

```bash
# Provision VM
ansible-playbook playbooks/provision-vm.yml \
  -e vm_name=DEMO-WEB01 \
  -e vm_ip_address=192.168.1.101

# Configure baseline
# (assumes a playbooks/windows-baseline.yml, which is not listed in the project structure above)
ansible-playbook playbooks/windows-baseline.yml --limit DEMO-WEB01

# Deploy application
ansible-playbook playbooks/install-iis.yml --limit DEMO-WEB01

# Update CMDB
ansible-playbook playbooks/sync-cmdb.yml --limit DEMO-WEB01
```

### Patch All Windows Servers

```bash
ansible-playbook playbooks/patch-vms.yml --limit windows_servers
```

### Update Specific Group

```bash
ansible-playbook playbooks/patch-vms.yml --limit web_servers
```

## Troubleshooting

### WinRM Connection Issues

```bash
# Test WinRM connectivity
ansible hyperv_hosts -m ansible.windows.win_ping

# Check with verbose output
ansible hyperv_hosts -m ansible.windows.win_ping -vvv
```

### Common Issues

1. **Kerberos Authentication Failure**
   - Verify DNS resolution (forward and reverse)
   - Check domain join status
   - Verify time synchronization
   - Check Kerberos ticket: `klist`

2. **Module Not Found**
   - Install collections: `ansible-galaxy collection install -r collections/requirements.yml`
   - Verify in AAP execution environment

3. **Timeout Issues**
   - Increase timeout in `ansible.cfg`
   - Check network connectivity
   - Verify WinRM service running

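For timeout issues, the WinRM connection plugin also accepts timeouts as host or group variables, which can be easier to scope than a global `ansible.cfg` change. A sketch with illustrative values (the file placement is an assumption):

```yaml
# group_vars/windows_servers.yml (illustrative values)
ansible_winrm_operation_timeout_sec: 60   # per-operation WinRM timeout
ansible_winrm_read_timeout_sec: 90        # HTTP read timeout; keep it above the operation timeout
```
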
## Security Considerations

### Credential Storage

- Use AAP credential vault (not Ansible Vault in production)
- Rotate credentials regularly
- Use least-privilege service accounts
- Separate credentials per environment

### Network Security

- Use WinRM over HTTPS (port 5986)
- Enable Kerberos encryption
- Implement network segmentation
- Use jump hosts/bastion for AAP

### Compliance

- Enable audit logging in AAP
- Log all playbook runs
- Track changes in ServiceNow CMDB
- Implement change approval workflow

## Future Enhancements

### Phase 2 - Advanced Features

- [ ] Custom execution environment with all dependencies
- [ ] Ansible Vault integration for secrets
- [ ] Enhanced autounattend.xml templating
- [ ] VM template/image management
- [ ] Backup and DR automation

### Phase 3 - EDA Integration

- [ ] ServiceNow incident-driven remediation
- [ ] Windows Event Log monitoring
- [ ] Hyper-V performance monitoring
- [ ] Self-healing automation

### Phase 4 - Enterprise Scale

- [ ] Multi-region Hyper-V clusters
- [ ] RBAC and delegation model
- [ ] Compliance scanning and remediation
- [ ] Cost tracking and optimization
- [ ] Disaster recovery automation

## Contributing

This is a demonstration project. When extending:

1. Follow existing patterns and structure
2. Test thoroughly in staging
3. Document all variables in group_vars
4. Use semantic versioning for releases
5. Update this CLAUDE.md with architectural changes

## References

- [Ansible Windows Guide](https://docs.ansible.com/ansible/latest/os_guide/windows_usage.html)
- [Ansible Automation Platform Docs](https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform)
- [ServiceNow ITSM Collection](https://github.com/ansible-collections/servicenow.itsm)
- [Event-Driven Ansible](https://www.ansible.com/products/event-driven-ansible)