Site Reliability Engineer (SRE)

Fnkt Labs (Fork & Knife Technologies)
Singapore (Raffles Place / Central Business District)
Hybrid

Type: Full-time
Level: Mid-level
Salary: S$7,000 – S$11,500 per month


About the role

Fnkt Labs builds cloud-first software for the F&B industry: ordering, POS integrations, kitchen display systems and retailer dashboards used by everyone from neighbourhood kopitiams to multi-outlet restaurants. We're a fast-growing product and ops team working closely with hospitality partners to reduce friction in day-to-day operations.

As a Site Reliability Engineer you'll be responsible for keeping our production systems healthy and performant. Expect to design and operate CI/CD pipelines, manage Kubernetes clusters, optimise deployment patterns and lead incident response. You will work cross-functionally with backend, mobile and product teams to embed reliability best practices into development and releases.

This role offers exposure to real-world F&B workloads: peak lunchtime traffic spikes from CBD lunch crowds, nightly batch processing for settlement and integration with a wide range of POS vendors. We offer hybrid working from our CBD office, an opportunity to shape our platform, and strong collaboration with product and operations teams aiming to expand across Singapore and regional markets.

About Fnkt Labs (Fork & Knife Technologies)

Fnkt Labs is the technology arm of a digital-first F&B group focused on solving operational pain points for restaurants, cafes and hawkers. We combine product development with hands-on hospitality experience to deliver stable, easy-to-integrate systems for order management, payments and analytics.

What you can expect

A live product used by kopitiams and restaurants across Singapore
Hybrid work model with CBD office near multiple hawker centres
Opportunity to shape platform reliability for real F&B transaction patterns
Learning budget and tech mentorship from senior engineers

Key responsibilities

Operate and maintain production Kubernetes clusters (EKS / GKE) and supporting cloud infrastructure (AWS / GCP).
Design, implement and own monitoring, alerting and runbooks using Prometheus, Grafana and logging stacks.
Lead incident response and post-incident reviews; drive reliability improvements to eliminate repeat incidents.
Build and maintain CI/CD pipelines (GitHub Actions, GitLab CI, or Jenkins) for multiple services.
Plan capacity and optimise infrastructure costs, particularly around peak dining periods.
Automate repetitive operational tasks using scripts and IaC (Terraform / CloudFormation).
Collaborate with backend and product teams to design resilient APIs and adopt safe rollout strategies (canary releases, feature flags).
Participate in the on-call rotation and provide timely support for production issues, including on weekends and public holidays as scheduled.

Requirements

3+ years of experience in Site Reliability, DevOps or Platform Engineering in a production environment.
Strong Linux systems administration skills and troubleshooting experience.
Hands-on experience with container orchestration (Kubernetes) and container tooling.
Proficiency with at least one cloud provider (AWS or GCP) and infrastructure-as-code tools (Terraform preferred).
Experience setting up observability: Prometheus, Grafana, ELK/OpenSearch or equivalent.
Scripting ability with Python, Go, Bash or similar for automation tasks.
Comfortable with CI/CD pipelines and git-based workflows.
Willingness to join rotational on-call duties; able to respond to incidents outside core hours when required.

Benefits

Competitive salary with performance bonus and stock option eligibility for key hires.
Hybrid work model (3 days office / 2 days remote typical) and flexible start times.
Medical insurance and outpatient coverage.
Learning & conference allowance and company-paid training days.
Monthly meal stipend and staff discounts at partner F&B venues.
Team offsites and regular tech knowledge-sharing sessions.
Transport allowance or commuting subsidy for eligible employees.

Work schedule

Typical week: 5 working days on a hybrid office arrangement, plus participation in the on-call rota, which includes occasional weekends and public holidays.

Core hours: 10:00–16:00 with flexible start/end times.
Standard shift: Monday–Friday working hours with hybrid remote options.
On-call rotation: one week per month (evening and weekend support as needed).

How to apply

Email your CV and a short note about your most relevant SRE experience to [email protected] with subject line: "SRE Application — [Your Name]".
