Site Reliability Engineer (SRE)
About the role
Fnkt Labs builds cloud-first software for the F&B industry: ordering, POS integrations, kitchen display systems and retailer dashboards used by businesses ranging from neighbourhood kopitiams to multi-outlet restaurants. We're a fast-growing product and ops team working closely with hospitality partners to reduce friction in day-to-day operations.
As a Site Reliability Engineer you'll be responsible for keeping our production systems healthy and performant. Expect to design and operate CI/CD pipelines, manage Kubernetes clusters, optimise deployment patterns and lead incident response. You will work cross-functionally with backend, mobile and product teams to embed reliability best practices into development and releases.
This role offers exposure to real-world F&B workloads: lunchtime traffic spikes from CBD crowds, nightly batch processing for settlement, and integrations with a wide range of POS vendors. We offer hybrid working from our CBD office, an opportunity to shape our platform, and close collaboration with product and operations teams as we expand across Singapore and regional markets.
About Fnkt Labs (Fork & Knife Technologies)
Fnkt Labs is the technology arm of a digital-first F&B group focused on solving operational pain points for restaurants, cafes and hawkers. We combine product development with hands-on hospitality experience to deliver stable, easy-to-integrate systems for order management, payments and analytics.
What you can expect
- Working product used by kopitiams and restaurants across Singapore
- Hybrid work model with CBD office near multiple hawker centres
- Opportunity to shape platform reliability for real F&B transaction patterns
- Learning budget and tech mentorship from senior engineers
Key responsibilities
- Operate and maintain production Kubernetes clusters (EKS / GKE) and supporting infrastructure in cloud (AWS / GCP).
- Design, implement and own monitoring, alerting and runbooks using Prometheus, Grafana and logging stacks.
- Lead incident response and post-incident reviews; drive reliability improvements to eliminate repeat incidents.
- Build and maintain CI/CD pipelines (GitHub Actions, GitLab CI, or Jenkins) for multiple services.
- Plan capacity and optimise infrastructure costs, particularly around peak dining periods.
- Automate repetitive operational tasks using scripts and IaC (Terraform / CloudFormation).
- Collaborate with backend and product teams to design resilient APIs and adopt safe rollout strategies (canary releases, feature flags).
- Participate in on-call rotation and provide timely support for production issues including weekends/public holidays as scheduled.
Requirements
- 3+ years of experience in Site Reliability, DevOps or Platform engineering in a production environment.
- Strong Linux systems administration skills and troubleshooting experience.
- Hands-on experience with container orchestration (Kubernetes) and container tooling.
- Proficiency with at least one cloud provider (AWS or GCP) and infrastructure-as-code tools (Terraform preferred).
- Experience setting up observability: Prometheus, Grafana, ELK/OpenSearch or equivalent.
- Scripting ability with Python, Go, Bash or similar for automation tasks.
- Comfortable with CI/CD pipelines and git-based workflows.
- Willingness to join rotational on-call duties; able to respond to incidents outside core hours when required.
Benefits
- Competitive salary with performance bonus and stock option eligibility for key hires.
- Hybrid work model (3 days office / 2 days remote typical) and flexible start times.
- Medical insurance and outpatient coverage.
- Learning & conference allowance and company-paid training days.
- Monthly meal stipend and staff discounts at partner F&B venues.
- Team offsites and regular tech knowledge-sharing sessions.
- Transport allowance or commuting subsidy for eligible employees.
Work schedule
Typical week: 5 days per week, hybrid office arrangement; participation in on-call rota which includes occasional weekends and public holidays.
- Core hours: 10:00–16:00 with flexible start/end times.
- Standard shift: Monday–Friday working hours with hybrid remote options.
- On-call rotation: one week per month (evening and weekend support as needed).
How to apply
Email your CV and a short note about your most relevant SRE experience to [email protected] with subject line: "SRE Application — [Your Name]".