Technical Support Engineer
About the Job
Location: Istanbul (Türkiye)
Shift work
Who are we?
Founded in 2009, Alibaba Cloud is a leading cloud computing and artificial intelligence company. Utilizing its proprietary Apsara Cloud operating system, Alibaba Cloud offers its global customers a comprehensive suite of cloud services based on a three-tier architecture: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Model as a Service (MaaS). Alibaba Cloud is the largest public cloud services provider in China and the Asia Pacific region.
With its large scale and strong foundation in IaaS and PaaS infrastructure, Alibaba Cloud provides enterprises with high-performance, low-cost computing resources and platform services for large-scale model training, fine-tuning, and inference. Alibaba Cloud aims to be the most open cloud computing company in the age of artificial intelligence. Alibaba Cloud's proprietary LLM, Qwen, is one of the world's leading LLMs and supports more enterprise customers in realizing their AI-driven innovations by releasing Qwen solutions of varying sizes and multimodal configurations as open source.
We are looking for a Cloud Support Engineer with strong troubleshooting skills across cloud infrastructure, security, databases, and cloud-native environments. You will diagnose and resolve complex technical issues affecting cloud-based systems and play a key role in keeping our platforms reliable, highly available, performant, and secure.
Main Responsibilities
Cloud and Infrastructure (Elastic Computing)
Troubleshoot Linux/Windows system problems and performance bottlenecks.
Diagnose virtual machine problems (boot failures, connectivity issues, disk mount failures).
Manage cloud instances, snapshots, images, and lifecycle operations.
Analyze system metrics (CPU, memory, I/O, load, OOM).
Support virtualized environments (KVM, VMware, Xen).
Cybersecurity
Investigate network and security incidents (DDoS, SQL injection, XSS, etc.).
Analyze logs and packet captures (Wireshark/tcpdump).
Troubleshoot WAF, firewall, and security group issues.
Identify misconfigurations in access control, routing, and policies.
Address issues related to TLS/SSL and certificates.
Databases and Big Data
Troubleshoot database performance issues (slow queries, deadlocks, replication).
Optimize SQL queries and indexing strategies.
Monitor database performance and resource usage.
Support backup/restore and high availability deployments.
Diagnose problems in big data systems (Hadoop, Spark, Kafka, etc.).
Cloud Native / Containers
Troubleshoot Kubernetes workloads and cluster issues.
Diagnose pod failures (CrashLoopBackOff, OOMKilled, Pending).
Resolve network and service access issues (Ingress, DNS, CNI).
Manage container runtimes (Docker/containerd).
Support CI/CD and Helm-based deployments.
Required Skills
- 3-5 years of hands-on experience in at least one of the following areas: Virtualization and Cloud Platforms, Cybersecurity Operations, Databases/Big Data, or Kubernetes/Cloud Native Applications.
- Strong command of Linux fundamentals (file systems, I/O, memory, processes).
- Solid knowledge of networking (TCP/IP, DNS, HTTP/HTTPS).
- Experience troubleshooting distributed systems.
Having the following is an advantage:
- Experience with OpenStack or public cloud platforms
- Familiarity with tools like Prometheus, Grafana, and ELK.
- Knowledge of automation and monitoring tools
- Relevant certifications (e.g., Security+, CISSP, Kubernetes, or cloud certifications)
- Scripting knowledge (Shell/Python)
Job requirements:
- Shift work, including at least one night shift per month.
- Weekend shifts.
- Must be based in Istanbul.
If you're ready to make a difference and believe you're a good fit for this position, please apply or contact our recruitment specialists.