Data Engineer - Machine Learning
525 Brannan Street Unit 308 San Francisco, CA 94107
Title: Data Engineer - Machine Learning
Company: AI/ML SaaS Platform - Medical Data
Location: San Francisco, CA
JBC is searching for brilliant Data Engineers with machine learning experience to work on data products that drive the core business: backend experts able to unify data and build systems that scale from both an operational and an organizational perspective. Our client is on a mission to understand and structure the world’s medical data using Machine Learning and Artificial Intelligence!
As a Data Engineer you will:
- Develop data infrastructure to ingest, sanitize and normalize a broad range of medical data, such as electronic health records, journals, established medical ontologies, crowd-sourced labelling and other human inputs.
- Build performant and expressive interfaces to the data
- Build infrastructure to help scale up not only data ingest, but also large-scale cloud-based machine learning
We’re looking for Engineers who bring:
- Experience building data pipelines from disparate sources
- Hands-on experience building and scaling up compute clusters
- Excitement about learning how to build and support machine learning pipelines that scale not just computationally, but in ways that are flexible, iterative, and geared for collaboration.
- A solid understanding of databases and large-scale data processing frameworks like Hadoop or Spark.
- You’ve not only worked with a variety of technologies, but also know how to pick the right tool for the job.
- A unique combination of creative and analytic skills, suited to designing a system that can pull together, train on, and test dozens of data sources under a unified ontology.
Bonus points if you have experience with:
- Developing systems to do or support machine learning, including experience working with NLP toolkits like Stanford CoreNLP, OpenNLP, and/or Python’s NLTK.
- Wrangling healthcare data and/or working with HIPAA.
- Managing large-scale data labelling and acquisition, through tools such as Amazon Mechanical Turk or DeepDive.