
1. Can you give me a brief overview of your project, roles, and responsibilities? Answer:
Project: Large-scale Advertising Platform
Role: DevOps Engineer
Responsibilities:
- Infrastructure Management: Designed and maintained scalable, highly available infrastructure on AWS/GCP using Terraform and CloudFormation.
- CI/CD Pipelines: Built and optimized Jenkins/GitLab CI pipelines for seamless deployments.
- Containerization: Dockerized microservices and managed Kubernetes clusters for orchestration.
- Automation: Automated repetitive tasks using Ansible, Shell scripting, and Python.
- Monitoring: Set up Prometheus, Grafana, and ELK stack for real-time monitoring and logging.
- Collaboration: Worked closely with dev and QA teams to ensure smooth releases and resolve environment issues.
- Security: Implemented security best practices (IAM, VPC, secrets management) and ensured compliance.
Impact: Improved deployment frequency by 40%, reduced downtime by 30%, and enhanced system reliability.
2. How many servers did you handle in production? Answer: In the advertising platform, I managed 500+ servers across multiple environments (dev, staging, prod). They were a mix of bare metal, VMs, and cloud instances (AWS/GCP), handling high-traffic ad-serving workloads with 99.99% uptime.
3. What technologies were used in your project (frontend and backend)? Answer:
Frontend:
- JavaScript (React.js, Angular) for dynamic user interfaces.
- HTML/CSS for structuring and styling.
Backend:
- Java (Spring Boot) for microservices.
- Python (Django/Flask) for data processing and APIs.
- Node.js for real-time ad-serving and event-driven tasks.
Databases:
- MySQL, PostgreSQL for relational data.
- MongoDB, Cassandra for NoSQL needs.
Other Tech:
- Kafka for event streaming.
- Redis for caching.
- Docker/Kubernetes for containerization and orchestration.
- Terraform/Ansible for IaC and automation.
4. How do you configure the frontend on a server? Answer:
- Deploy Code: Use git clone or SCP to transfer frontend code to the server.
- Web Server Setup: Use Nginx or Apache to serve static files.
- HTTPS: Use Let’s Encrypt for SSL certificates.
- Restart Web Server: Restart Nginx/Apache to apply changes.
- Firewall Rules: Allow HTTP (80) and HTTPS (443) traffic.
5. How do you ensure the frontend application runs securely? Answer:
- Use HTTPS with SSL/TLS certificates.
- Add security headers (e.g., X-Content-Type-Options, X-Frame-Options).
- Configure a restrictive CORS policy so that only trusted origins can make cross-origin requests.
- Sanitize user inputs to prevent XSS attacks.
- Regularly update dependencies and libraries.
- Use rate limiting to prevent brute-force attacks.
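The rate-limiting item above can be illustrated with a token bucket, one common approach among several. This is a minimal sketch, not a production limiter: the class name `TokenBucket` and its parameters are hypothetical, and a real deployment would usually rely on the web server or API gateway instead.

```python
import time


class TokenBucket:
    """Allow bursts up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock            # injectable so the bucket can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        """Return True and consume one token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        # Refill for the elapsed time, never exceeding the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

One bucket per client (for example, per IP address or per account) keeps a single noisy client from exhausting the allowance of the others.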
6. How do you troubleshoot and resolve CORS issues? Answer:
- Check browser console for CORS error messages.
- Configure the backend to allow requests from the frontend domain.
- Add CORS headers in Nginx/Apache:
add_header 'Access-Control-Allow-Origin' 'https://frontend-domain.com';
- Handle preflight requests (OPTIONS) properly.
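The server-side decision described above can be sketched in Python. This is an illustration under stated assumptions: the function name `cors_headers`, the allowed methods, and the max-age value are hypothetical, and the origin is the example domain used above.

```python
# Hypothetical allow-list; the domain is the example origin from the text above.
ALLOWED_ORIGINS = {"https://frontend-domain.com"}
ALLOWED_METHODS = ("GET", "POST", "OPTIONS")


def cors_headers(origin, method, allowed_origins=ALLOWED_ORIGINS):
    """Return the CORS response headers for a request, or {} if the origin is not allowed."""
    if origin not in allowed_origins:
        # Without these headers the browser blocks the cross-origin response.
        return {}
    headers = {
        "Access-Control-Allow-Origin": origin,
        "Vary": "Origin",  # the response differs per origin, so caches must key on it
    }
    if method == "OPTIONS":
        # Preflight: tell the browser which methods and headers the real request may use.
        headers["Access-Control-Allow-Methods"] = ", ".join(ALLOWED_METHODS)
        headers["Access-Control-Allow-Headers"] = "Content-Type, Authorization"
        headers["Access-Control-Max-Age"] = "600"
    return headers
```

Echoing back a checked origin, rather than sending `*`, keeps the policy compatible with requests that carry credentials.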
7. How do you manage backend servers with high availability and optimize latency? Answer:
- Use load balancers (e.g., AWS ALB, Nginx).
- Deploy servers in multi-region setups.
- Use caching (e.g., Redis) and CDNs for latency optimization.
- Monitor with tools like Prometheus and Grafana.
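To make the load-balancing item concrete, here is a minimal round-robin sketch that skips backends marked unhealthy. It only illustrates the idea; it is not how AWS ALB or Nginx are implemented, and the class and method names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out backends in rotation, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        """Record a failed health check."""
        self.healthy.discard(backend)

    def mark_up(self, backend):
        """Record a recovered backend."""
        if backend in self.backends:
            self.healthy.add(backend)

    def next_backend(self):
        """Return the next healthy backend in rotation."""
        # One full lap over the ring is enough to find a healthy backend if any exists.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")
```

In practice the health state would be driven by periodic checks against each backend rather than by manual calls.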
8. How do you connect to a server in a private network without internet access? Answer:
- Use a VPN to access the private network.
- Use a jump host/bastion host to SSH into the private server.
- Use out-of-band management tools like IPMI or iDRAC.
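9. How do you reduce a 1GB file to 500MB? Answer:
- Use compression tools like gzip, bzip2, or xz. The achievable ratio depends on the content: text and logs compress well, while already-compressed data barely shrinks.
- For media files, reduce quality/resolution using FFmpeg or ImageMagick.
- Use truncate -s 500M <file> to cut the file to size. This discards everything past that point, so use it only when losing that data is acceptable.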
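Whether compression alone reaches the target depends on the data. The sketch below compares the gzip ratio of repetitive log-like text with random bytes, which stand in for already-compressed media. The helper name `gzip_ratio` and the sample log line are assumptions made for the example.

```python
import gzip
import os


def gzip_ratio(data: bytes, level: int = 9) -> float:
    """Return compressed size divided by original size for the given bytes."""
    if not data:
        raise ValueError("data must not be empty")
    return len(gzip.compress(data, compresslevel=level)) / len(data)


# Log-like text: highly repetitive, so it compresses very well.
repetitive = b"2024-01-01 INFO request served in 12ms\n" * 10_000
# Random bytes: no redundancy to exploit, similar to media that is already compressed.
random_like = os.urandom(len(repetitive))

print(f"log-like data ratio: {gzip_ratio(repetitive):.3f}")
print(f"random data ratio:   {gzip_ratio(random_like):.3f}")
```

A ratio at or below 0.5 means gzip alone would bring a 1GB file of that kind under 500MB; a ratio near 1.0 means another approach is needed.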
10. What is the difference between soft links and hard links? Answer:
- Soft Link: Stores the path of the target file, can span filesystems and point to directories, and breaks if the target is deleted or moved.
- Hard Link: A second directory entry for the same inode. It cannot span filesystems and remains valid if the original name is deleted, because the data is freed only when the last link is removed.
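The difference can be demonstrated with a short Python script, assuming a Linux or other POSIX filesystem that supports both link types. The file names are arbitrary and everything happens in a temporary directory.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    hard = os.path.join(tmp, "hard.txt")
    soft = os.path.join(tmp, "soft.txt")

    with open(original, "w") as f:
        f.write("hello")

    os.link(original, hard)     # hard link: a second name for the same inode
    os.symlink(original, soft)  # soft link: a small file that stores the target path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink  # number of names referencing the inode

    os.remove(original)  # removes one name; the inode survives through the hard link

    with open(hard) as f:
        hard_content = f.read()
    # The symlink still exists as a link, but its target path no longer resolves.
    soft_is_broken = os.path.islink(soft) and not os.path.exists(soft)

print(same_inode, link_count, hard_content, soft_is_broken)
```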
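11. What happens if you run rm -rf / as root? Answer: It recursively deletes everything on the system, leaving it unbootable and unusable. Modern GNU rm refuses to operate on / unless --no-preserve-root is given, but variants such as rm -rf /* are not covered by that safeguard. Always double-check commands before running them as root.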
12. How do you check which process is consuming high memory? Answer:
Use top (press Shift + M) or htop to sort processes by memory usage. Alternatively, use:
ps aux --sort=-%mem | head -n 10
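When the same check has to run from a script, the ps output can be parsed directly. The sketch below assumes the standard `ps aux` column layout; the function name `top_memory` and the sample rows (including the process names) are made up for illustration.

```python
def top_memory(ps_output: str, n: int = 3):
    """Parse `ps aux` text and return (pid, %mem, command) for the n largest %MEM values."""
    lines = ps_output.strip().splitlines()
    header = lines[0].split()
    pid_i = header.index("PID")
    mem_i = header.index("%MEM")
    cmd_i = header.index("COMMAND")
    rows = []
    for line in lines[1:]:
        # COMMAND is the last column and may contain spaces, so cap the number of splits.
        fields = line.split(None, cmd_i)
        rows.append((int(fields[pid_i]), float(fields[mem_i]), fields[cmd_i]))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Hypothetical sample in `ps aux` format.
SAMPLE = """\
USER  PID %CPU %MEM    VSZ   RSS TTY STAT START TIME COMMAND
root    1  0.0  0.1 169000 11000 ?   Ss   10:00 0:01 /sbin/init
app   812  3.2 41.5 900000 81000 ?   Sl   10:01 1:10 java -jar ad-server.jar
db    640  1.1 22.0 700000 45000 ?   Sl   10:01 0:40 postgres
"""

for pid, mem, command in top_memory(SAMPLE, n=2):
    print(pid, mem, command)
```

On a live host, the text could come from `subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout`.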
13. How do you trace all application activities on a server? Answer:
- Use auditd to track system calls and file access.
- Use strace to trace process activities.
- Use tcpdump to monitor network traffic.
- Use centralized logging tools like ELK Stack.
14. What is kernel patching, and how do you do it? Answer: Kernel patching means updating the Linux kernel to fix bugs or security vulnerabilities. Steps (on Debian/Ubuntu):
- Check the current kernel version:
uname -r
- Install the latest kernel packages:
sudo apt update && sudo apt full-upgrade
- Reboot the system so it boots into the new kernel, then confirm the version with uname -r.
15. How do you manage sensitive information in Ansible? Answer:
- Use Ansible Vault to encrypt sensitive data.
- Store secrets in environment variables.
- Integrate with HashiCorp Vault or cloud secret managers.
- Use no_log to hide sensitive output.
16. How do you manage task repetition in Ansible? Answer: Use loops with the loop keyword:
- name: Create multiple users
  user:
    name: "{{ item }}"
    state: present
  loop:
    - alice
    - bob
17. How do you automate web server restarts after configuration changes? Answer: Use handlers in Ansible. A handler runs only when a task that notifies it reports a change:
tasks:
  - name: Update Apache configuration
    copy:
      src: files/apache.conf
      dest: /etc/apache2/apache2.conf
    notify: Restart Apache
handlers:
  - name: Restart Apache
    service:
      name: apache2
      state: restarted
18. How do you make one Ansible role depend on another? Answer: Use dependencies in meta/main.yml:
dependencies:
  - role: common
19. What was the most difficult configuration you made with Ansible? Answer: Setting up a multi-cloud, highly available Kubernetes cluster with automated disaster recovery. Benefits included resilience, scalability, and cost optimization.
1. How do you handle zero-downtime deployments in Ansible? Answer:
To achieve zero-downtime deployments:
1. Use rolling updates in Ansible:
   - name: Deploy application with zero downtime
     hosts: webservers
     serial: 1
     tasks:
       - name: Deploy new version
         copy:
           src: app-v2.war
           dest: /var/lib/tomcat/webapps/app.war
       - name: Restart Tomcat
         service:
           name: tomcat
           state: restarted
2. Use load balancers to route traffic away from the server being updated.
3. Ensure the application supports graceful shutdowns and session persistence.
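The batching behind a rolling update can be sketched in Python. This is a simplified model, not Ansible's implementation: the function names are hypothetical, and the sketch stops at the first batch containing any failure, whereas Ansible's exact stopping rule depends on settings such as max_fail_percentage.

```python
def rolling_batches(hosts, serial=1):
    """Split hosts into consecutive batches of at most `serial` hosts each."""
    if serial < 1:
        raise ValueError("serial must be >= 1")
    return [hosts[i:i + serial] for i in range(0, len(hosts), serial)]


def rolling_update(hosts, serial, deploy):
    """Run deploy(host) batch by batch and stop at the first batch with a failure.

    Returns (updated_hosts, failed_batch); failed_batch is None when every batch succeeded.
    """
    updated = []
    for batch in rolling_batches(hosts, serial):
        results = [deploy(host) for host in batch]
        if not all(results):
            # Hosts in later batches keep running the old version.
            return updated, batch
        updated.extend(batch)
    return updated, None
```

With `serial=1`, at most one host is out of rotation at a time, which is what keeps the remaining servers available to the load balancer.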
2. How do you debug a failing Ansible playbook? Answer:
1. Use the -vvv flag for verbose output:
   ansible-playbook playbook.yml -vvv
2. Check the task output for errors.
3. Use the debug module to print variable values:
   - name: Debug variable
     debug:
       var: my_variable
4. Test individual tasks using ansible ad-hoc commands:
   ansible webservers -m ping
3. How do you manage configuration drift in Ansible? Answer:
1. Use idempotent tasks to ensure consistent configurations.
2. Run playbooks regularly to enforce desired states.
3. Use ansible-pull for continuous configuration enforcement.
4. Integrate with CI/CD pipelines to detect and correct drift automatically.
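Drift detection boils down to comparing the desired state with what is actually on the host. The sketch below shows that comparison on plain dictionaries; the function name `find_drift` and the sample settings are assumptions, and Ansible performs the equivalent check inside each idempotent module.

```python
ABSENT = "<absent>"  # marker for a setting that exists on only one side


def find_drift(desired: dict, actual: dict) -> dict:
    """Return {key: (desired_value, actual_value)} for every setting that differs."""
    drift = {}
    for key in desired.keys() | actual.keys():
        want = desired.get(key, ABSENT)
        have = actual.get(key, ABSENT)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical desired and observed settings for one host.
desired_state = {"port": 443, "tls": "1.3"}
actual_state = {"port": 443, "tls": "1.2", "debug": True}

for key, (want, have) in sorted(find_drift(desired_state, actual_state).items()):
    print(f"{key}: desired={want} actual={have}")
```

An empty result means the host matches the desired state, which is the condition a scheduled pipeline run would assert.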
4. How do you automate database migrations with Ansible? Answer:
1. Use the community.mysql or community.postgresql modules to manage databases.
2. Write tasks to apply SQL scripts:
   - name: Apply database migration
     community.mysql.mysql_query:
       login_db: mydb
       query: "{{ lookup('file', 'migration.sql') }}"
3. Use handlers to restart database services if needed.
5. How do you manage secrets in Ansible for cloud environments? Answer:
1. Use Ansible Vault to encrypt sensitive data.
2. Integrate with cloud secret managers (e.g., AWS Secrets Manager, Azure Key Vault).
3. Fetch secrets dynamically at run time. For AWS, the amazon.aws.aws_secret lookup plugin (named amazon.aws.secretsmanager_secret in newer releases of the collection) reads a secret, whereas the community.aws.secretsmanager_secret module creates and updates secrets:
   - name: Fetch database password from AWS Secrets Manager
     set_fact:
       db_password: "{{ lookup('amazon.aws.aws_secret', 'db_password') }}"
     no_log: true
6. How do you optimize Ansible playbooks for large-scale environments? Answer:
1. Use serial to limit the number of hosts updated simultaneously.
2. Enable pipelining in ansible.cfg:
   [ssh_connection]
   pipelining = True
3. Use the free strategy so that each host runs through its tasks without waiting for the other hosts:
   - hosts: all
     strategy: free
     tasks:
       - name: Run tasks in parallel
         shell: sleep 10
4. Set gather_facts: false on plays that do not need facts.
7. How do you handle version control for Ansible playbooks? Answer:
1. Use Git for version control.
2. Organize playbooks into roles and collections.
3. Use tags to manage specific tasks or roles:
   - name: Install Apache
     apt:
       name: apache2
       state: present
     tags: apache
4. Implement CI/CD pipelines to test and deploy playbooks.
8. How do you manage Ansible roles for multiple teams? Answer:
1. Use Ansible Galaxy to share and reuse roles.
2. Create role dependencies in meta/main.yml.
3. Use collections to bundle related roles and modules.
4. Implement code reviews and testing for shared roles.
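9. How do you automate certificate management with Ansible? Answer:
1. Use the community.crypto modules to manage certificates (x509_certificate replaces the older openssl_certificate name, and a provider is required):
   - name: Generate self-signed SSL certificate
     community.crypto.x509_certificate:
       path: /etc/ssl/certs/server.crt
       privatekey_path: /etc/ssl/private/server.key
       csr_path: /etc/ssl/certs/server.csr
       provider: selfsigned
2. Use Let’s Encrypt for automated certificate issuance. The acme_certificate task runs twice: the first run returns the challenge data, and the second run, after the challenge has been fulfilled, retrieves the certificate:
   - name: Request Let's Encrypt certificate
     community.crypto.acme_certificate:
       acme_directory: https://acme-v02.api.letsencrypt.org/directory
       acme_version: 2
       account_key_src: /etc/ssl/private/account.key
       csr: /etc/ssl/certs/server.csr
       dest: /etc/ssl/certs/server.crt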
10. How do you handle dynamic inventory in Ansible for cloud environments? Answer:
1. Use dynamic inventory plugins (e.g., aws_ec2, gcp_compute).
2. Enable the plugin in ansible.cfg:
   [inventory]
   enable_plugins = aws_ec2
3. Fetch inventory dynamically:
   ansible-inventory -i aws_ec2.yml --list
4. Use filters to select instances and keyed_groups on instance tags to group hosts dynamically.
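Tag-based grouping can be pictured as a simple transformation from instance metadata to group names. The sketch below is loosely modeled on what an inventory plugin does when it groups hosts by tag; the function name `group_by_tag`, the group naming scheme, and the sample instances are assumptions for illustration only.

```python
from collections import defaultdict


def group_by_tag(instances, tag_key, prefix="tag"):
    """Group instance names by the value of one tag, e.g. tag_env_prod."""
    groups = defaultdict(list)
    for instance in instances:
        value = instance.get("tags", {}).get(tag_key)
        if value is not None:  # instances without the tag are left ungrouped
            groups[f"{prefix}_{tag_key}_{value}"].append(instance["name"])
    return dict(groups)


# Hypothetical instance metadata, as a cloud API might return it.
INSTANCES = [
    {"name": "web1", "tags": {"env": "prod", "role": "web"}},
    {"name": "web2", "tags": {"env": "staging", "role": "web"}},
    {"name": "db1", "tags": {"env": "prod", "role": "db"}},
    {"name": "scratch", "tags": {}},
]

print(group_by_tag(INSTANCES, "env"))
```

Because the groups are derived from tags on every run, newly launched instances appear in the right group without any inventory file being edited.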