The Position:
Roche Sequencing Solutions is expanding with a Principal Data Scientist to lead and help drive forward R&D activity by providing deep insight into experimental data, its interpretation, and its implications for project development efforts. This role seeks a highly advanced and versatile individual who combines deep statistical expertise with proficiency in advanced data analysis tools and a focus on applying these skills to complex biological problems. (Applicants whose primary experience is narrowly defined within established bioinformatics analysis workflows may not align with the scope of this Principal Data Scientist role.)
Principal data scientists combine skills from different domains to organize, process, and learn from data, often through the lens of domain-expert-informed models that help to abstract concepts from the data, test their validity, and make predictions.
The Opportunity:
You will analyze large datasets to identify patterns, correlations, and anomalies that might be hidden within the data, using statistical methods and machine learning algorithms to extract meaningful insights.
You will have experience with design of experiments (DOE) for biological and assay development experiments, where we need to draw meaningful conclusions from small sample sizes.
You will provide statistical analysis of experimental results and communicate it to internal customers in a way that is both approachable and informative.
You will work cross-functionally, using various pilot studies and then integrating the necessary analytical or inferential routines into a pipeline through a collaborative software engineering effort. The overall goal is a validated internal software product that experimentalists can use regularly to track their research progress.
You will utilize AI-based tools for data analysis and create AI-based support tools for internal customers.
You have the ability to create clear and informative visualizations of the data and analysis results, making it easier for technical and non-technical teams to understand the situation and make informed decisions.
You are able to identify technical challenges as they pertain to your team as well as collaborating teams, and to define requirements and architecture for next-generation machine learning and statistical analysis products.
This position is based on-site in Santa Clara, CA.
Relocation benefits are not being offered for this role.
Who You Are:
Required:
You have a PhD in Statistics, Data Science, Computer Science, Engineering, or other related areas of study and 5 years (or more) of experience handling various data sets/modules.
You are able to define and lead analysis projects interfacing with multiple stakeholders.
You have demonstrated experience handling large datasets, with a demonstrated ability to translate the data into meaningful and actionable reports.
You have demonstrated experience in the application of statistical methods for process control and optimization (e.g., Statistical Process Control, A/B testing design/analysis, or advanced time series modeling) in a non-biological context.
You have a demonstrated level of experience in a wide range of ML algorithms (traditional to advanced) and a strong understanding of the principles behind model training, validation and hyperparameter tuning.
You have practical experience in designing and implementing automated workflows, and the ability to troubleshoot and map out a solution.
You have experience with genome sequencing data from multiple technologies; you have experience using machine learning to solve biological problems.
You are proficient in various programming tools, e.g. Python, R, Java, C++.
Preferred:
The expected salary range for this position based on the primary location of Santa Clara, CA is $128,000 - $282,00. Actual pay will be determined based on experience, qualifications, geographic location, and other job-related factors permitted by law. A discretionary annual bonus may be available based on individual and Company performance.
This position also qualifies for the benefits detailed at the link provided below.