Are you adept at transforming and organizing varied, complex data? Do you have experience in data engineering with unstructured data? You might be the person we are looking for.
Our enterprise-wide data science team is seeking a top-notch data engineer with strong technical knowledge and a real passion for addressing business needs through data analysis. As an individual contributor on this team, you will create tools and data pipelines that leverage the latest advances in data engineering to address high-impact research and business questions across research and development, clinical, commercial, and general and administrative areas of our business. You'll work side-by-side with internal partners from across the organization to develop creative solutions for our highest priority business needs.
The ideal candidate will be a data engineer who is driven to build pipelines that supply data to data scientists and who can work both independently and as part of a highly collaborative team. We are seeking a candidate with the demonstrable ability to find solutions where others can't, who has the drive and determination to pull the team forward and persevere. We're looking for self-starters with a strong sense of urgency who thrive when operating in a fast-paced environment.
Write clean, maintainable data pipelines that feed data scientists
Correct, transform and enrich multiple sources of data
Quickly and efficiently load bulk and streaming data
Work closely with the data science team and internal business partners to identify the path to a successful product
Bachelor's degree or higher in Computer Science or related discipline
3+ years of experience using an ETL tool (Informatica or Talend preferred)
5+ years of experience with SQL database queries and programming
Experience programming in Java or Python
Familiarity with data quality, cleaning and masking techniques
Experience handling unstructured data
Experience working across multiple compute environments to create workflows and pipelines (e.g. HPC, cloud, Linux systems)
An ability to interact with a variety of large-scale data structures (e.g. HDFS, SQL, NoSQL)
Strong adherence to data privacy standards and ethics
Deep understanding of algorithms and performance optimization
Strong interpersonal and communication skills and a demonstrated ability to work and collaborate in a team environment
Previous experience in healthcare, life sciences, or pharmaceutical industry is a plus
Experience with AWS cloud technologies and stack
Knowledge of distributed data processing and management systems
Experience with big data analytics platforms and/or workflow tools
Demonstrated ability to organize and incorporate complex systems requirements into product features and prioritize features effectively