Senior Machine Learning Engineer
What would you work on?
Context
- Our products are a set of tools that scan public GitHub activity and private Git repositories for security vulnerabilities.
- They are used by different teams: Software Development and Ops, Application Security, and Threat Response. The buying decision comes from CISOs / CTOs / Directors of Security.
- By design, GitGuardian is a data-driven company. Both co-founders are former Data Scientists, and GitGuardian's first product is the real-time processing of all new GitHub events. Our secret detection engine has been battle-tested against huge amounts of data.
In this context, GitGuardian now wants to take it to the next level by incorporating Machine Learning models to create better vulnerability detectors and also improve internal performance efficiency. That’s why your work will matter and will be taken seriously!
...
The existing team
You will join the Data team, currently composed of 3 people focused on Data Engineering, and you will be the first Machine Learning Engineer. You will work closely with the Secrets team, which has built a strong knowledge around secrets detection and the data we manipulate. An end-of-study intern is already working on building Machine Learning models to create new secret detectors.
...
Missions
As a Machine Learning Engineer, you will have to:
- Lead the end-to-end development of scalable and reliable Machine Learning models that can be used to solve business problems. For instance, build a new generation of secret detectors using state-of-the-art LLMs.
- Identify areas in the company where Machine Learning can be applied. In particular, launch ML experiments to bring new in-app features to our existing products, like incident severity classification.
- Deploy tools to monitor the quality and performance of the models, while ensuring that they meet the business requirements
- Work closely with the Data Engineering team to ensure smooth data integration
- Collaborate with DevOps and Software Engineering teams to deploy models into our products, monitor their performance, and troubleshoot issues as needed
- Stay up-to-date with the latest advancements in NLP and ML technologies to implement new techniques into the existing models and foster a culture of innovation and continuous learning
- Communicate about cutting-edge ML applications at GitGuardian by writing blog posts and participating in meetups
Advantages
- You will build and deploy state-of-the-art models that bring high value to the business
- You will be able to leverage a huge amount of textual data collected from day 1
- The ML tooling landscape is still to be defined
- You will be part of a scale-up adventure with a strong engineering culture
Our technical stack
- Snowflake
- PostgreSQL, Elasticsearch, MongoDB
- Airbyte
- Metabase, Tableau
- GitLab
- AWS, Terraform, Docker, Kubernetes
...
More about you
If you think you only match 70% to 80% of these criteria, please send us your resume!
And if you still have some questions before applying, you can write to us directly at: careers@gitguardian.com
Hard skills
- 5+ years of hands-on experience in building, deploying and maintaining ML models with concrete business applications
- Solid analytical and advanced statistical skills
- Deep knowledge of state-of-the-art NLP techniques and models, especially LLMs
- Strong programming skills in one or more programming languages focused on data processing (Python, Scala, etc.) along with skills in application best practices (code modularity, unit tests, documentation, etc.)
- Fluent in ML libraries such as PyTorch, Transformers, spaCy, scikit-learn
- Strong experience in packaging and delivering ML models in production using cloud-based platforms
- Experience with Docker and MLOps tools (Airflow, MLflow)
- Experience in using Hugging Face and transformers is a plus
- Experience in Data Warehousing (Snowflake, BigQuery) and data app prototyping (Streamlit, Dash) is a plus
Soft skills
- You like algorithms and new technology
- You like to write high quality and re-usable code
- You are used to carrying out applied research projects and bringing them to production
- You are autonomous, proactive and curious
- You are a team player with strong communication skills. In particular, you should be able to work with cross-functional teams, and be able to communicate technical concepts to non-technical stakeholders.
- You are able to work in a fast-paced and dynamic environment, and adapt to changing requirements
- You speak fluent French and English
Bonus points
- You don’t embed API keys in your code ;)
- You have a deep understanding of startup dynamics and challenges
- You have experienced strong team growth at a previous company
...
Why should you join us?
- 💰 Attractive package that includes stock-options
- 🏡 Remote-friendly environment: up to 3 days/week for people living in Île-de-France, (almost) full-remote policy for people living elsewhere in France.
- 💻 Home office allowance to improve your set-up at home, and the latest technology equipment
- 🌴 Yearly holiday allowance
- 🤝 Referral bonus of 4000€ for any new Guardians we might hire thanks to you
- 🍺 Lots of team-building activities including 1 per month for the whole company
- 🐕 Pet-friendly offices, every Guardian gets to bring their dogs
- 👊 Working on a meaningful product: we have already helped more than 300k developers!
- 📈 A strong engineering culture, see this page to discover our R&D projects
- 🚀 Many opportunities for career development in the long term
- 👫 Trust & autonomy within your scope, with very transparent internal communication
Recruitment process
1 video call with a recruiter
To discover your professional project and evaluate whether there could be a mutual match
1 team interview to meet the team and your future manager
To learn more about you, present the team to you (missions, rituals, seniority level), and make sure you would be able to succeed in the following steps of the recruitment process
1 technical test
To evaluate your skills for the position and help you picture yourself in the role
1 final interview with the CEO or the Engineering Manager
To explain our company's vision and ambitions for the next couple of years, and make sure you are up for the position
Curious to know more about us?
GitGuardian is a global post-Series B cybersecurity startup; we had raised $44M by the end of 2021 from American and European investors, including top-tier VC firms.
More than ever in 2023, we have a very solid business model with a fast-growing ARR, multi-year contracts and great customer retention rates.
Investors
- Our early investors, who saw our market value proposition, include GitHub co-founder Scott Chacon and Docker co-founder / CTO Solomon Hykes 👀
Products
- We develop code security solutions for the DevOps generation and are a leader in the market of secrets detection & remediation.
- Our solutions are already used by hundreds of thousands of developers in all industries and GitGuardian Internal monitoring is the n°1 security app on the GitHub marketplace 🔥
Clients
- GitGuardian helps organizations find exposed sensitive information that could often lead to tens of millions of dollars in potential damage.
- We work with some of the largest IT outsourcing companies, publicly listed companies like Talend or tech companies like Datadog.
- More than 80% of our customers are in the United States.
People
- The majority of the team is based in Paris and we are growing fast, and in a sustainable way, while maintaining our core values.
- The Guardians are very knowledgeable, committed, serious, aligned with the company’s mission, and true team players: always willing to help peers grow their skills!
- The team is diverse, with more than 12 nationalities, and speaks English regularly.
- We are also very agile, remote-friendly, pet-friendly 🐕 in the office, and fun people to work with.
- Team: Engineering
- Locations: GitGuardian Paris
- Remote status: Hybrid Remote
Guardians of Code
We love wearing our Guardians’ cape, and help each other achieve high ambitions!