Sargent & Lundy is a leading consulting engineering firm specializing in the power and energy sectors. Since 1891, we have provided comprehensive engineering, design, and consulting services for both traditional and renewable power generation, grid modernization, nuclear power, and beyond. Our mission is to help clients achieve their energy goals effectively by leveraging advanced technologies and adopting sustainable practices.
We are looking for a sharp, hands-on Infrastructure & Cloud Systems Engineer who thrives in both the data center and the cloud. You’ll own the reliability, performance, and security of our hybrid infrastructure — from bare-metal HPE servers and VMware clusters to Azure cloud environments — and drive the automation and engineering discipline that keeps everything running at peak efficiency. This is a high-impact role for someone who moves fast, thinks in systems, and takes ownership.
Key Responsibilities
On-Premises Data Center & Server Administration
- Administer and optimize physical and virtual server environments across on-premises data centers, including resource allocation (CPU, memory, storage), capacity planning, and lifecycle management.
- Manage VMware vSphere/vCenter environments — including ESXi host configuration, VM provisioning, cluster management, DRS/HA policies, and license allocation and optimization to control spend and maximize utilization.
- Perform regular hardware health monitoring, firmware patching, and hardware lifecycle tracking in coordination with vendor support.
- Own storage administration across SAN/NAS environments — LUN provisioning, volume management, performance tuning, and capacity forecasting.
- Maintain network fabric within the data center, including VLAN configuration, switching, and integration with enterprise routing and firewall infrastructure.
- Drive disaster recovery and business continuity for on-premises systems — design, execute, and document DR/failover tests on a defined schedule; validate RTO/RPO targets and remediate gaps.
Cloud Infrastructure (Azure & Hybrid)
- Design, deploy, and maintain Azure infrastructure — compute, networking, storage, identity, and platform services — in alignment with architecture standards and security baselines.
- Manage Microsoft Entra ID (formerly Azure AD), RBAC, Conditional Access, and identity governance for both cloud and hybrid environments.
- Implement and maintain cloud cost governance — right-sizing, reserved instance strategy, tagging hygiene, and license optimization across cloud resource pools.
- Build and maintain Infrastructure as Code using Terraform, ARM templates, or Bicep to ensure consistent, auditable, and repeatable provisioning.
- Develop and manage CI/CD pipelines for infrastructure deployment using Azure DevOps or GitHub Actions.
HPE Data Center Stack Administration
- Administer HPE ProLiant and HPE Synergy compute platforms — server provisioning, firmware and BIOS management, iLO configuration, and hardware health monitoring.
- Manage HPE OneView for composable infrastructure — server profile templates, firmware baseline management, and data center resource orchestration.
- Administer HPE BladeSystem and Synergy chassis environments including Virtual Connect fabric modules, interconnect configuration, and enclosure management.
- Manage HPE 3PAR and/or HPE Nimble Storage arrays — volume provisioning, thin cloning, snapshot scheduling, replication, performance tiering, and capacity optimization.
- Operate and maintain networking infrastructure within the data center, including segmentation, VLAN management, and integration with core routing.
- Leverage HPE InfoSight and CloudPhysics (or equivalent HPE AIOps tools) for predictive analytics, anomaly detection, and workload optimization across the HPE stack.
- Coordinate with HPE support and field engineers for hardware break/fix, parts replacement, and proactive maintenance under active support contracts.
Operations, Security & Automation
- Monitor and respond to infrastructure alerts across on-prem and cloud using centralized observability platforms; drive incident resolution and root cause analysis.
- Automate operational workflows — patching, provisioning, remediation, compliance reporting — using PowerShell, Ansible, Python, or Azure CLI.
- Implement and enforce security controls: encryption at rest and in transit, vulnerability remediation, firewall policy management, logging, audit trails, and compliance with organizational and regulatory requirements.
- Administer core services: DNS, DHCP, certificate management, file and print, backup and replication, and endpoint configuration management.
- Maintain and test backup and recovery solutions for both on-premises and cloud workloads; document recovery procedures and validate recoverability regularly.
Collaboration & Continuous Improvement
- Partner with security, development, and IT operations teams to deliver resilient, scalable hybrid solutions and serve as an escalation point for complex infrastructure issues.
- Develop and maintain runbooks, standard operating procedures, architecture documentation, and operational playbooks.
- Contribute to infrastructure roadmaps and standards; proactively identify opportunities to reduce technical debt, improve efficiency, and increase platform maturity.
- Participate in the on-call rotation and after-hours maintenance windows as required.
This position offers a hybrid schedule, with three days per week expected in our downtown Chicago office and two days working remotely.