Job Description
Key Responsibilities
- Ensure service reliability by defining and tracking SLIs and SLOs and driving continuous improvement initiatives.
- Develop automation tools and internal solutions (using Go/Python) for deployment, rollback, configuration, and cluster management.
- Optimize CI/CD pipelines (including canary, blue-green, and progressive deployment strategies) for secure and efficient releases.
- Implement and maintain observability systems (metrics/logging/tracing with Prometheus/Grafana/OpenTelemetry) and establish effective alerting mechanisms.
- Lead incident response, root cause analysis, and post-mortem processes to enhance system resilience.
- Conduct capacity planning and performance optimization for high-throughput, low-latency workloads.
- Collaborate with security teams to manage secrets (Vault/KMS/HSM) and strengthen infrastructure security.
- Operate and monitor blockchain full nodes and RPC services, and plan their upgrades and production integration.
Job Requirements
- Strong Linux and Kubernetes expertise, with a DevOps/SRE background and a software engineering mindset.
- Proficiency in containers, CI/CD tools, and infrastructure-as-code (Terraform/Helm).
- Ability to develop automation tools (Go preferred, Python acceptable).
- Understanding of networking, distributed systems, and microservice architectures.
- Experience operating blockchain nodes (Ethereum, Solana, Bitcoin, etc.).
- Excellent troubleshooting skills, accountability, and cross-team communication.
- [Senior] Minimum 3 years of DevOps/SRE experience supporting production systems.
Preferred Qualifications
- Experience running validator nodes or staking environments.
- Knowledge of incident management, runbook creation, and disaster recovery.
- Familiarity with databases, caching, and messaging systems (Postgres/Redis/Kafka).
- Network and security hardening expertise (TLS/firewalls/DDoS protection).
- Fluent English communication skills.
Benefits
- Work on cutting-edge Web3 projects in a technology-driven organization with a flat structure.
- Highly competitive compensation package with performance bonuses.
- Abundant growth opportunities: attend top industry conferences and collaborate with elite developers.
- Flexible work arrangements, including remote options, plus comprehensive benefits.
