Initial hyper-v demo skeleton

CLAUDE.md (new file, 346 lines)

# Hyper-V Windows Server Automation - Project Documentation

## Project Overview

This project automates the full lifecycle of Windows Server VMs on Hyper-V (provisioning, patching, application deployment, configuration management, and CMDB synchronization) with Ansible Automation Platform (AAP), following GitOps and Infrastructure as Code (IaC) principles.

**Target Audience**: Enterprise IT operations teams, infrastructure engineers, platform engineers

**Deployment Platform**: Ansible Automation Platform 2.4 or later (its automation controller was formerly Ansible Tower)

**Future Roadmap**: Event-Driven Ansible (EDA) integration for reactive automation

## Architecture

### Technology Stack

- **Automation Engine**: Ansible Core 2.15+
- **Platform**: Ansible Automation Platform 2.4+
- **Hypervisor**: Microsoft Hyper-V (Windows Server 2019/2022)
- **Guest OS**: Windows Server 2019/2022
- **CMDB**: ServiceNow ITSM
- **Version Control**: Git (GitOps workflow)
- **Authentication**: Active Directory / Kerberos

### Connectivity Model

```
Ansible Automation Platform
    ↓ (WinRM over HTTPS/Kerberos)
Windows Hyper-V Host(s)
    ↓ (Hyper-V PowerShell)
Windows Server VMs
    ↓ (REST API)
ServiceNow CMDB
```
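
The first hop of this model (WinRM over HTTPS with Kerberos) is normally expressed as connection variables in the inventory. The following is a minimal sketch, not the project's actual inventory: the host names and the `windows` parent group are hypothetical, while the group names `hyperv_hosts`, `windows_servers`, and `web_servers` follow the `group_vars/` files listed in this document.

```yaml
# inventory/production/hosts.yml (illustrative sketch; host names are hypothetical)
all:
  children:
    windows:                          # hypothetical parent group holding shared WinRM settings
      children:
        hyperv_hosts:
          hosts:
            hv01.example.com:
        windows_servers:
          children:
            web_servers:              # assumed to be a subset of windows_servers
              hosts:
                demo-web01.example.com:
      vars:
        ansible_connection: winrm
        ansible_port: 5986            # WinRM over HTTPS
        ansible_winrm_scheme: https
        ansible_winrm_transport: kerberos
        ansible_winrm_server_cert_validation: validate
```

Keeping the connection variables on a dedicated parent group, rather than on `all`, avoids applying WinRM settings to non-Windows hosts that may be added later.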

### Core Use Cases

1. **VM Provisioning**: Automated creation of Windows Server VMs using unattended installation (autounattend.xml)
2. **Patch Management**: Automated Windows Update deployment triggered by git commits
3. **Application Deployment**: Install and configure applications (IIS demonstration)
4. **Configuration Management**: Day-2 operations and drift remediation
5. **CMDB Synchronization**: Bidirectional sync with ServiceNow CMDB
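
As an illustration of use case 5, the Ansible-to-ServiceNow direction of the sync could look roughly like the play below. This is a sketch under assumptions, not the contents of `playbooks/sync-cmdb.yml`: it assumes the `servicenow.itsm` collection is installed, that `vm_ip_address` is defined for each host, and that the ServiceNow connection details are exposed as the environment variables `SN_HOST`, `SN_USERNAME`, and `SN_PASSWORD`.

```yaml
- name: Register Windows servers in the ServiceNow CMDB (sketch)
  hosts: windows_servers
  gather_facts: false
  tasks:
    - name: Ensure a configuration item exists for this host
      servicenow.itsm.configuration_item:
        instance:
          # Assumption: connection details arrive as environment variables
          host: "{{ lookup('ansible.builtin.env', 'SN_HOST') }}"
          username: "{{ lookup('ansible.builtin.env', 'SN_USERNAME') }}"
          password: "{{ lookup('ansible.builtin.env', 'SN_PASSWORD') }}"
        name: "{{ inventory_hostname }}"
        sys_class_name: cmdb_ci_win_server
        ip_address: "{{ vm_ip_address }}"   # assumed to be defined per host
        state: present
      delegate_to: localhost                # the API call runs on the control node, not over WinRM
```

The reverse direction (reading CMDB records back into Ansible) is not shown here.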

## Project Structure

```
.
├── ansible.cfg                  # Ansible configuration with Windows/WinRM defaults
├── collections/
│   └── requirements.yml         # Required Ansible collections
├── inventory/
│   ├── production/
│   │   └── hosts.yml            # Production inventory
│   └── staging/
│       └── hosts.yml            # Staging inventory (future)
├── group_vars/
│   ├── all.yml                  # Global variables
│   ├── hyperv_hosts.yml         # Hyper-V host configuration
│   ├── windows_servers.yml      # Windows Server defaults
│   └── web_servers.yml          # IIS/web server configuration
├── host_vars/                   # Host-specific variables (future)
├── playbooks/
│   ├── provision-vm.yml         # VM provisioning workflow
│   ├── patch-vms.yml            # Windows Update automation
│   ├── install-iis.yml          # IIS deployment
│   └── sync-cmdb.yml            # ServiceNow CMDB sync
├── roles/                       # Custom roles (future development)
│   ├── windows_baseline/        # Windows hardening & baseline config
│   ├── hyperv_vm/               # Hyper-V VM management
│   ├── iis_webapp/              # IIS application deployment
│   └── servicenow_sync/         # ServiceNow integration
├── templates/                   # Jinja2 templates (future)
│   └── autounattend.xml.j2      # Windows unattended install template
└── README.md                    # Quick start guide
```
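
For orientation, `collections/requirements.yml` might contain something like the following. The list is an assumption based on what this document references (`ansible.windows.*` modules and the ServiceNow ITSM collection); `community.windows` is an additional guess, and version pins are deliberately left out.

```yaml
# collections/requirements.yml (illustrative sketch)
collections:
  - name: ansible.windows      # win_ping, win_reboot and other core Windows modules
  - name: community.windows    # assumption: extra Windows modules, if needed
  - name: servicenow.itsm      # ServiceNow CMDB integration
```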

## Key Design Patterns

### GitOps Workflow

All infrastructure changes flow through Git:
1. Engineer creates feature branch
2. Updates inventory or group_vars to define desired state
3. Commits and creates pull request
4. AAP webhook triggers job template for validation
5. After merge, AAP webhook triggers deployment
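
The "desired state" edited in step 2 is plain YAML under version control. A minimal sketch of what such a declaration could look like is shown below; the `hyperv_vms` variable and its keys are hypothetical and not defined anywhere in this project.

```yaml
# group_vars/hyperv_hosts.yml (illustrative sketch; the schema is hypothetical)
hyperv_vms:
  - name: DEMO-WEB01
    ip_address: 192.168.1.101
    cpu_count: 2
    memory_mb: 4096
    state: present
```

With this shape, adding or removing a VM is a reviewable diff to a list entry, which is what the pull request in step 3 would contain.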

### Idempotency

All playbooks must be idempotent: safe to run repeatedly, reporting no changes once the desired state is reached. Use:
- Declarative `state: present` / `state: absent` parameters rather than imperative commands
- Conditional tasks with `when:`
- `changed_when:` and `failed_when:` so command-style tasks report changes and failures accurately
- Check mode support (`--check`)
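
As a sketch of these points, the task below creates a Hyper-V VM only when it does not already exist and reports "ok" rather than "changed" on later runs. It assumes the task runs against a Hyper-V host with the Hyper-V PowerShell module available; `vm_name` matches the extra var used elsewhere in this document, while `vm_state` and the fixed 4 GB memory size are illustrative assumptions.

```yaml
- name: Ensure the VM exists on the Hyper-V host (sketch)
  ansible.windows.win_powershell:
    parameters:
      Name: "{{ vm_name }}"
    script: |
      param(
          [string]$Name
      )
      if (Get-VM -Name $Name -ErrorAction SilentlyContinue) {
          # VM already present: tell Ansible nothing changed
          $Ansible.Changed = $false
      }
      else {
          # Memory size is a placeholder value for illustration
          New-VM -Name $Name -Generation 2 -MemoryStartupBytes 4GB | Out-Null
      }
  when: vm_state | default('present') == 'present'   # vm_state is a hypothetical variable
```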

### Credential Management

- **Never commit secrets to Git**
- Use AAP credential types for:
  - Machine credentials (WinRM)
  - ServiceNow credentials
  - Domain join credentials
- Use Ansible Vault for sensitive variables in development
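
One way to keep ServiceNow credentials out of Git is a custom credential type in AAP, which stores the values in the controller and injects them into the job at run time. The sketch below shows the two YAML documents such a credential type takes; the field ids and the `SN_*` environment variable names are choices made for this example, not something the project defines.

```yaml
# Input configuration: the fields a user fills in when creating the credential
fields:
  - id: instance
    type: string
    label: ServiceNow instance URL
  - id: username
    type: string
    label: Username
  - id: password
    type: string
    label: Password
    secret: true          # stored encrypted and never shown again in the UI
required:
  - instance
  - username
  - password
---
# Injector configuration: how the values are exposed to the running job
env:
  SN_HOST: '{{ instance }}'
  SN_USERNAME: '{{ username }}'
  SN_PASSWORD: '{{ password }}'
```

A playbook can then read the values from the environment, so no secret appears in inventory or group_vars.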

### Role-Based Organization

Future development should extract common patterns into roles:
- `roles/windows_baseline/`: Base Windows configuration
- `roles/hyperv_vm/`: VM lifecycle management
- `roles/iis_webapp/`: IIS deployment patterns

## Technical Requirements

### Prerequisites

1. **Ansible Automation Platform**
   - AAP 2.4 or later
   - Controller configured with Windows machine credentials
   - Execution environment with Windows collections

2. **Hyper-V Environment**
   - Windows Server 2019/2022 with Hyper-V role
   - WinRM enabled and configured
   - Kerberos authentication configured
   - Sufficient storage for VM images

3. **Network Requirements**
   - WinRM ports (5985 for HTTP, 5986 for HTTPS) open from AAP to Hyper-V hosts; only 5986 is needed for the HTTPS setup described here
   - The same WinRM ports open from AAP to managed Windows VMs
   - DNS resolution for all hosts
   - Active Directory domain membership

4. **ServiceNow**
   - ServiceNow instance with CMDB
   - API user credentials
   - CMDB table structure defined

### Windows Remote Management Setup

On all Windows hosts (Hyper-V and VMs):

```powershell
# Enable WinRM with an HTTPS listener; this requires a server-authentication
# certificate matching the host name to be installed already
winrm quickconfig -transport:https
# Allow Kerberos authentication
winrm set winrm/config/service/auth '@{Kerberos="true"}'
# Reject unencrypted traffic
winrm set winrm/config/service '@{AllowUnencrypted="false"}'
```

## Development Guidelines

### Adding New Playbooks

1. Create playbook in `playbooks/` directory
2. Use descriptive names: `verb-noun.yml` (e.g., `deploy-webapp.yml`)
3. Include proper documentation in header
4. Add tags for selective execution
5. Implement check mode support
6. Test in staging environment first
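
A playbook skeleton that follows these steps might look like the sketch below. The file name reuses the `deploy-webapp.yml` example from step 2; the header layout and the `webapp` tag are assumptions, not an established project convention.

```yaml
---
# playbooks/deploy-webapp.yml (illustrative sketch)
# Purpose : Install the IIS role as the base for a web application
# Targets : web_servers
# Tags    : webapp
- name: Deploy web application
  hosts: web_servers
  gather_facts: false
  tasks:
    - name: Ensure the IIS role is installed
      ansible.windows.win_feature:
        name: Web-Server
        state: present
        include_management_tools: true
      register: iis_feature
      tags: [webapp]

    - name: Reboot only if the feature installation requires it
      ansible.windows.win_reboot:
      when: iis_feature.reboot_required | default(false)
      tags: [webapp]
```

Running it with `--check --limit <host>` against staging first covers steps 5 and 6.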

### Variable Precedence

Define variables at the least specific level that works. The levels below are listed from lowest to highest precedence (a simplified view of Ansible's full precedence rules):
1. `group_vars/all.yml` - Global defaults
2. `group_vars/<group>.yml` - Group-specific
3. `host_vars/<host>.yml` - Host-specific
4. Playbook `vars:` - Playbook overrides
5. Extra vars (`-e`) - Runtime overrides
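
To make the layering concrete, the sketch below defines the same variable at two levels. The variable name `patch_reboot_allowed` is hypothetical and used only for illustration.

```yaml
# group_vars/all.yml: global default for every host
patch_reboot_allowed: false
---
# group_vars/web_servers.yml: overrides the default for hosts in web_servers
patch_reboot_allowed: true
# A host_vars/<host>.yml entry would override both files above,
# and an extra var passed with -e would override everything.
```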

### Testing Strategy

1. **Syntax Check**: `ansible-playbook --syntax-check playbook.yml`
2. **Check Mode**: `ansible-playbook --check playbook.yml`
3. **Limit Scope**: Use `--limit` to test on a single host first
4. **Verbose Output**: Use `-v`, `-vv`, `-vvv` for debugging
5. **Staging First**: Always test in staging before production

### Windows Module Best Practices

- Use fully qualified collection names such as `ansible.windows.win_feature` instead of short `win_*` module names
- Always handle reboots explicitly with `ansible.windows.win_reboot`
- Use `register:` to capture task output
- Check `reboot_required` in the results of modules that report it (for example `win_updates` and `win_feature`)
- Use `failed_when:` for expected error conditions
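
Taken together, these practices could look like the pair of tasks below. This is a sketch rather than the contents of `playbooks/patch-vms.yml`; the update categories and the 30-minute reboot timeout are example values.

```yaml
- name: Install security and critical updates
  ansible.windows.win_updates:
    category_names:
      - SecurityUpdates
      - CriticalUpdates
    state: installed
  register: update_result

- name: Reboot only when the update run reports that one is required
  ansible.windows.win_reboot:
    reboot_timeout: 1800      # seconds; example value
  when: update_result.reboot_required
```

Leaving the reboot as a separate task keeps it visible in job output and lets it be skipped or tagged independently.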

## AAP Integration

### Job Templates to Create

1. **Provision VM** - `playbooks/provision-vm.yml`
   - Survey for VM name, IP, CPU, RAM
   - Credential: Hyper-V machine credential
   - Webhook enabled for GitOps

2. **Patch VMs** - `playbooks/patch-vms.yml`
   - Limit pattern for selective patching
   - Scheduled for maintenance windows
   - Credential: Windows machine credential

3. **Deploy IIS** - `playbooks/install-iis.yml`
   - Limit to web_servers group
   - Credential: Windows machine credential

4. **Sync CMDB** - `playbooks/sync-cmdb.yml`
   - Scheduled daily
   - Credentials: Windows + ServiceNow
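
These job templates can themselves be kept as code. The sketch below defines the "Patch VMs" template with the `ansible.controller` collection; it assumes that collection is available and that controller connection details are supplied through the `CONTROLLER_HOST`, `CONTROLLER_USERNAME`, and `CONTROLLER_PASSWORD` environment variables. The organization, project, inventory, and credential names are hypothetical.

```yaml
- name: Define AAP job templates as code (sketch)
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Ensure the "Patch VMs" job template exists
      ansible.controller.job_template:
        name: Patch VMs
        organization: Default                # hypothetical
        project: hyperv-automation           # hypothetical project name
        inventory: Production                # hypothetical inventory name
        playbook: playbooks/patch-vms.yml
        credentials:
          - Windows machine credential       # hypothetical credential name
        ask_limit_on_launch: true            # allows a limit pattern for selective patching
        state: present
```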

### Workflow Templates (Future)

Create workflows for complex orchestration:
- **Full VM Lifecycle**: Provision → Configure → Deploy App → Update CMDB
- **Patch & Compliance**: Patch → Verify → Update CMDB → Generate Report

### Event-Driven Ansible (Future)

Planned EDA integrations:
- ServiceNow incident triggers remediation playbook
- Windows Event Log monitoring triggers security response
- Hyper-V alerts trigger capacity management
- Git webhook triggers deployment pipeline

## Common Tasks

### Bootstrap a New VM

```bash
# Provision VM
ansible-playbook playbooks/provision-vm.yml \
  -e vm_name=DEMO-WEB01 \
  -e vm_ip_address=192.168.1.101

# Configure baseline (windows-baseline.yml is not listed under Project Structure; add it first)
ansible-playbook playbooks/windows-baseline.yml --limit DEMO-WEB01

# Deploy application
ansible-playbook playbooks/install-iis.yml --limit DEMO-WEB01

# Update CMDB
ansible-playbook playbooks/sync-cmdb.yml --limit DEMO-WEB01
```

### Patch All Windows Servers

```bash
ansible-playbook playbooks/patch-vms.yml --limit windows_servers
```

### Update Specific Group

```bash
ansible-playbook playbooks/patch-vms.yml --limit web_servers
```

## Troubleshooting

### WinRM Connection Issues

```bash
# Test WinRM connectivity
ansible hyperv_hosts -m ansible.windows.win_ping

# Check with verbose output
ansible hyperv_hosts -m ansible.windows.win_ping -vvv
```

### Common Issues

1. **Kerberos Authentication Failure**
   - Verify DNS resolution (forward and reverse)
   - Check domain join status
   - Verify time synchronization
   - Check Kerberos ticket: `klist`

2. **Module Not Found**
   - Install collections: `ansible-galaxy collection install -r collections/requirements.yml`
   - Verify that the collections are present in the AAP execution environment

3. **Timeout Issues**
   - Increase timeout in `ansible.cfg`
   - Check network connectivity
   - Verify that the WinRM service is running
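
Besides `ansible.cfg`, the WinRM-specific timeouts can be raised per group through connection variables, which may be handy when only some hosts are slow. The values below are examples, not tuned recommendations.

```yaml
# group_vars/windows_servers.yml (illustrative sketch)
ansible_winrm_operation_timeout_sec: 60   # how long a single WinRM operation may take
ansible_winrm_read_timeout_sec: 90        # HTTP read timeout; keep it above the operation timeout
```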

## Security Considerations

### Credential Storage

- Use AAP credential vault (not Ansible Vault in production)
- Rotate credentials regularly
- Use least-privilege service accounts
- Separate credentials per environment

### Network Security

- Use WinRM over HTTPS (port 5986)
- Enable Kerberos encryption
- Implement network segmentation
- Use jump hosts/bastion for AAP

### Compliance

- Enable audit logging in AAP
- Log all playbook runs
- Track changes in ServiceNow CMDB
- Implement change approval workflow

## Future Enhancements

### Phase 2 - Advanced Features

- [ ] Custom execution environment with all dependencies
- [ ] Ansible Vault integration for secrets
- [ ] Enhanced autounattend.xml templating
- [ ] VM template/image management
- [ ] Backup and DR automation

### Phase 3 - EDA Integration

- [ ] ServiceNow incident-driven remediation
- [ ] Windows Event Log monitoring
- [ ] Hyper-V performance monitoring
- [ ] Self-healing automation

### Phase 4 - Enterprise Scale

- [ ] Multi-region Hyper-V clusters
- [ ] RBAC and delegation model
- [ ] Compliance scanning and remediation
- [ ] Cost tracking and optimization
- [ ] Disaster recovery automation

## Contributing

This is a demonstration project. When extending:

1. Follow existing patterns and structure
2. Test thoroughly in staging
3. Document all variables in group_vars
4. Use semantic versioning for releases
5. Update this CLAUDE.md with architectural changes

## References

- [Ansible Windows Guide](https://docs.ansible.com/ansible/latest/os_guide/windows_usage.html)
- [Ansible Automation Platform Docs](https://access.redhat.com/documentation/en-us/red_hat_ansible_automation_platform)
- [ServiceNow ITSM Collection](https://github.com/ansible-collections/servicenow.itsm)
- [Event-Driven Ansible](https://www.ansible.com/products/event-driven-ansible)