Wednesday, February 19, 2025

Urgent Need: AWS Incident Management Specialist, Reston, VA (Onsite), Locals Only

Role: AWS Incident Management Specialist
Location: Reston, VA (Onsite)

Requirement: Only local candidates. Please provide resumes along with a local driver's license or ID.

Important Note: Please do not share profiles for non-local candidates.



Job Description

Key Job Functions

• Manage IT production incidents to resolution in a 24/7/365 environment using incident management processes and communicate incident status, impact, and resolution actions.

• Hands-on experience managing and monitoring applications deployed on Amazon Web Services (AWS).

• Troubleshoot and resolve incidents on the AWS cloud infrastructure.

• Experience building tools for monitoring and troubleshooting system resources in an AWS environment. Ability to triage AWS-related incidents using AWS monitoring tools.

• Experience with performance engineering of AWS Cloud applications.

• Hands-on experience working with AWS services such as EC2, ELB, RDS, Redshift, DynamoDB, Aurora, Route 53, ECS, Lambda, S3, Batch, CloudWatch, CloudTrail, and WAF.

• Hands-on experience with transaction-level monitoring using Dynatrace and Splunk.

• Ability to perform transaction-level monitoring and troubleshooting in the AWS cloud platform.

• Monitor the health of applications and the underlying infrastructure.

• Monitoring experience with tools such as ExtraHop, SolarWinds, the Netcool suite, Catchpoint, and Moogsoft.

• Analyze dashboards and reporting/monitoring tools to identify trends and patterns in application health and performance.

• Proactively look for hardware, software, and environmental alerts or malfunctions.

• Effectively lead and guide incident triage calls from a technical perspective, analyzing different components of the infrastructure and application environment using a variety of monitoring tools and processes.

• Troubleshoot incidents and identify root causes quickly using operations, wire data analytics, application performance management, and event correlation monitoring tools.

• Perform analysis of data, evaluating multiple application protocols including web, database, storage, and supporting infrastructure such as AWS, UNIX, DNS, LDAP, SSL, SMTP, and FTP.

• Collaborate with technical teams and articulate troubleshooting steps effectively.

• Participate in technical follow-up calls for critical incidents.

• Assist with documentation of Root Cause Analysis (RCA) or Correction of Errors (COE) and data quality for all communicated incidents.

• Ensure appropriate functional and management escalation takes place as per the standards and procedures.

• Follow up on items that could potentially negatively impact production operations, assist with postmortem activities, and support various efforts related to operational improvements.

• Implement new and improved processes, change processes, perform new tasks, create reports, and address ad-hoc requests based on management recommendations.

• Participate in on-call rotation and work on any shifts as needed, including weekends and night shifts.

• Report incident details and metrics to senior leadership.

Minimum Experience, Specialized Knowledge & Skills

• 6+ years of working experience with IT infrastructure components such as Unix/Linux servers, Wintel servers, AWS, networks, firewalls, routers, load balancers, VPN, Apache, WebLogic, LDAP, Active Directory, Exchange, Oracle/MS SQL databases, SAN, virtualization, email systems, enterprise monitoring, and access management solutions for single sign-on. Experience with at least eight of the above is preferred.

• Mid-level hands-on working experience with Amazon Web Services (AWS).

• Understanding of the different layers of the AWS infrastructure, e.g., WAF, Route 53, CloudFront, load balancing, and high-availability (HA) features.

• Proven methodical approach to problem identification, monitoring, problem-solving, and resolution.

• Ability to analyze different components of the infrastructure and application environments during incident triage calls.

• Ability to trace transaction failures and debug the root cause in various layers of the AWS infrastructure and services.

• Aptitude to influence other technical teams on incident calls and articulate troubleshooting steps effectively.

• Experience and confidence working with all levels of management; excellent written and verbal skills.

• Ability to quickly and concisely communicate with senior management on technical issues in non-technical terms and to run large conference calls during incident calls with a wide range of personnel and management levels.

• Strong relationship management skills and the ability to multitask and work well in a high-stress environment, both within teams and independently.

• AWS Certified Solutions Architect Associate or higher certification.

• Monitoring and observability experience.

• Experience with monitoring dashboards for incident detection and alerting.

• Perform end-to-end analysis of transactions under an observability environment.

• Troubleshoot incidents and identify root causes quickly using wire data analytics, application performance management, and event correlation monitoring tools.

• Diagnose and resolve incidents by providing factual data from various monitoring and instrumentation systems.

• Monitor applications and infrastructure using tools such as Splunk, Dynatrace, OpenTelemetry, Catchpoint, xMatters, SignalFx, SolarWinds, and ExtraHop.



