The Sr. Cloud Engineer (SCE) will report to the Cloud Engineering Manager in the Technology Department. This role is responsible for the design, administration, maintenance, and support of the enterprise infrastructure for a dynamic, growing business. The environment spans private and public clouds, networks, firewalls, servers, operating systems, applications, mobile devices, process schedulers, telecommunications, and general databases. The SCE will work closely with the development teams, client service teams, and business leaders to integrate, support, operate, and provide infrastructure related to the Platform Architecture within Millennium Trust. Working within a team environment, the SCE participates directly in solution creation and provides hands-on support, operational support, and training. This individual must be creative, client-focused, solutions-driven, organized, and able to thrive in a dynamic environment.
- Provide direct support and improve day‑to‑day operations of hardware and operating systems, including cloud services.
- Evaluate system utilization, monitor response time, and provide primary support for the detection and correction of operational issues.
- Coordinate and perform additions and changes to servers, networks, operating systems, and attached devices, including investigation, analysis, recommendation, configuration, installation, and testing of new network hardware and software.
- Ensure servers, operating systems, and network components are implemented in compliance with information security policies and infrastructure standards.
- Utilize metrics and cloud‑native consumption‑based services to improve cost efficiencies.
- Implement and configure cloud‑native services in CI/CD pipelines using Infrastructure‑as‑Code tools.
- Implement and configure Istio service mesh and Helm chart deployments.
- Implement and configure firewalls and security appliances.
- Implement and configure disaster recovery and business resumption plans related to backup and restoration of the technology infrastructure.
- Ensure runbooks are updated on a regular basis.
- Maintain VMware and cloud virtual environments.
- Maintain the Microsoft Active Directory domain.
- Mentor fellow cloud administrators and engineers.
- Manage, monitor, troubleshoot, and support existing projects and processes while collaborating with cross‑functional teams to define software requirements and propose solutions.
- Provide technical subject‑matter expertise and collaborate with broader teams to ensure development activities align with scope, schedule, priority, and business objectives.
- Maintain and upgrade deployment platforms and system infrastructure using Infrastructure‑as‑Code tools such as Terraform, ARM templates, PowerShell, or similar technologies.
- Troubleshoot AKS cluster‑based issues, including deployments through Azure DevOps pipelines.
- Utilize the Azure CLI, PowerShell, and ARM tools while collaborating with security, operations, engineering, and application teams to deliver complete Azure solutions.
- Provide infrastructure problem resolution for applications across the organization.
- Provide general SQL Server database troubleshooting and support.
- Utilize programming skills to design and develop scripts or programs for repetitive tasks.
- Perform all duties with a focus on Inspira’s goals, including risk mitigation.
- Support inbound calls and emails while maintaining tickets within the infrastructure support tracking system.
- Cross‑train team members to ensure adequate coverage.
- Perform other duties as assigned.
Qualifications
Years of Experience:
• 5-7 years of relevant professional experience in computer science or a related field
Degree:
• Bachelor’s degree in Computer Science or equivalent experience
Certification:
• AZ-900, AZ-104, AZ-700, AZ-500
Skills & Abilities: Minimum of 5 years of experience with:
- Developing and maintaining CI/CD pipelines using Azure DevOps (ADO) while collaborating across teams to improve build, integration, and release processes
- Assisting in root cause analysis and improving system reliability using techniques such as self-healing, canary deployments, and alerting
- Building and deploying container workloads using Terraform
- Expertise in Kubernetes, AKS, Azure DevOps, and Terraform
- Experience working on complex projects involving multiple teams and partners
- Implementing and configuring resilient, highly available IT systems
- Experience with Windows and/or Linux operating systems including building, security, and deployment in both physical and cloud environments
- Experience with Microsoft Azure PaaS and SaaS solution development technologies such as Azure API Management, Azure Functions, and Logic Apps
- Experience with JSON, REST, and data-based APIs and high-scale performance services
- Experience with Azure Service Bus and Azure Notification Hubs
- Exposure to logging and monitoring tools native to Azure
- Exposure to building disaster recovery and high availability solutions
- Ability to create clear and concise documentation of cloud and on‑premises data center infrastructure development and standards
- Networking knowledge with the ability to configure, maintain, and support Cisco Meraki devices, Azure VNets, VPN Gateways, Route Servers, Route Tables, and Private Endpoints
- Implementing and configuring Infrastructure as Code through CI/CD pipelines
- Applying information security best practices and methodologies to operating systems, networks, and databases
- Experience with scripting languages such as PowerShell, Bash, JavaScript, and Python
- Experience with cloud services including Azure (preferred), Google Cloud, or AWS
- Experience with backup and disaster recovery (BDR) solutions such as Veeam, VMware Site Recovery, and Azure Backup/Site Recovery
- Implementing Microsoft 365 services including Exchange, Intune, SharePoint, and Teams
- Ability to implement and configure Kubernetes deployments utilizing custom resources
- Experience with general SQL Server database troubleshooting
- Ability to work independently with minimal supervision
- Excellent written and verbal communication skills
- Strong analytical skills with demonstrated follow-up and problem-solving abilities
- Ability to conduct research into hardware and software issues and products as required
- Ability to effectively prioritize and execute tasks in a high-pressure environment
- Strong interpersonal and presentation skills with the ability to collaborate with non-technical users, technical leads, and developers
- Knowledge of TCP/IP protocols and firewalls
- Knowledge of virtual machines and container concepts
- Knowledge of cloud security concepts including the shared responsibility model
- Experience working with ticketing systems and internal clients
- Ability to respond to emails and text messages after hours to resolve critical issues
- Strong personal diplomacy and client service skills with a high level of motivation, professionalism, teamwork, and trustworthiness
- Strong vendor management skills
- Highly self-motivated and directed
- Experience working in high availability environments preferred
- Knowledge of ITIL and ITSM practices and frameworks preferred