Senior Data Scientist

Job function: Data Science
Job type: Full Time
Apply before: 22 Oct 2023
Computer Software

About Sonatype

Bringing you a better way to build software.

Sonatype is the software supply chain management company. We’re on a mission to change how the world innovates by making software development easier. From running the world’s largest repository of Java open-source components (Maven Central), to inventing componentized software development and then software supply chain management, to creating the only solution that stops open-source malware in its tracks, we’re constantly leading the industry while helping thousands of customers manage open source every day.
Our technology is already used by 15 million developers, and we have lofty goals for it to be in the hands of every engineering team. And we need you to do that.
Sonatype’s mission is to enable organizations to better manage their software supply chain. We offer a series of products and services including the Sonatype Nexus Repository and Sonatype Lifecycle.
*** This position is 100% remote and candidates must currently live in Canada or the US. ***
You’ll be working with one of our sophisticated research teams to help turn large amounts of data into valuable insights for our customers. We’re building our data science program, so you’ll be helping to build out our standard processes as we grow. We have a large team of dedicated data engineers and data scientists so you can focus on doing what you do best: building models.

What You’ll Be Doing

  • Interact with product management and data engineers to think through potential ways to leverage data.
  • As an authority in machine learning, you will largely drive the direction of your work, since you know best what is possible.
  • Assure the quality of the models you produce and monitor them over time.
  • Lead the research, development, and deployment of machine learning models for detecting malicious behavior, applying innovative techniques such as GANs, VAEs, and generative AI.
  • Collaborate closely with cross-functional teams, including data engineers, software developers, and domain experts, to identify business requirements and translate them into practical data science solutions.
  • Explore and evaluate different generative AI approaches and algorithms to detect and predict malicious activities, anomalies, and behavioral patterns in diverse datasets.
  • Design and implement scalable and efficient data processing pipelines to collect, cleanse, and preprocess large-scale datasets for training and validation purposes.
  • Develop and implement feature engineering strategies, dimensionality reduction techniques, and data augmentation methods to improve the performance and generalization capabilities of the models.
  • Conduct in-depth exploratory data analysis and develop statistical models to identify patterns, correlations, and trends in data related to fraud and behavioral patterns.
  • Collaborate with the data governance team to ensure compliance with data privacy regulations and ethical considerations while working with critical customer data.
  • Stay updated on the latest research and advancements in generative AI, fraud detection, and behavioral analysis domains, and evaluate their applicability to enhance our existing models and methodologies.
  • Mentor and provide guidance to junior data scientists, assisting them in developing their technical skills and understanding of AI capabilities.
  • Present findings, insights, and model performance to both technical and non-technical partners, communicating complex concepts in a clear and concise manner.
  • Excellent problem-solving abilities and the capacity to develop innovative solutions to sophisticated data science challenges
  • Strong grasp of robust model validation techniques, including cross-validation and evaluation metrics suitable for assessing generalization performance.
  • Demonstrated ability to implement data science best practices
  • Proficiency using Jupyter or Databricks notebooks


  • Strong academic credentials in computer science, statistics, data science, machine learning or a related field.
  • 8+ years of hands-on experience as a data scientist.
  • Strong expertise in generative modeling and deep learning architectures.
  • Thorough quantitative background.
  • Demonstrated understanding of fraud detection techniques, anomaly detection, and behavioral analysis.
  • Proficiency in programming languages such as Python or R, and experience with relevant libraries and frameworks (e.g., TensorFlow, Keras, PyTorch, scikit-learn).
  • Demonstrated experience in working with large-scale datasets, data preprocessing, and feature engineering.


  • Familiarity with Databricks, AWS, S3, EMR, and SageMaker would be beneficial
  • Experience with Git and preferably GitHub
  • PySpark, MLflow, LangChain, Hugging Face APIs
  • Our data engineers primarily use Java and Scala. We don’t expect you to be writing Java/Scala code, but familiarity may make it easier to work with the data engineers.

What We Offer

  • The opportunity to be part of an incredible, high-growth company, working on a team of expert colleagues
  • Competitive salary package
  • Medical/Dental/Vision benefits
  • Business casual dress
  • Flexible work schedules that ensure time for you to be you
  • 2019 Best Places to Work (Washington Post and Washingtonian)
  • 2019 Wealthfront Top Career Launch Company
  • EY Entrepreneur of the Year 2019
  • Fast Company Top 50 Companies for Innovators
  • Glassdoor ranking of 4.9
  • Come see why we’ve won all of these awards
