Dashmote is an AI technology scale-up headquartered in Amsterdam, The Netherlands. With the goal of bridging the gap between images and data, we are working to bring AI-based solutions to marketers at clients like Heineken, Unilever, Philips, L’Oreal, and Coca-Cola. We add value in areas such as Location Analysis, Trends Analysis, and Marketing Intelligence.
Today, our company has offices in Amsterdam, Shanghai, Vienna, and New York. Over the past few years, our teams have solved a wide variety of cases, such as analyzing beer drinking and hairstyle trends by utilizing our Visual Recognition Tools, as well as identifying prospective leads by generating intelligence dashboards derived from Visual Content Analysis.
As our very first dedicated Data Engineer in the Amsterdam office, you’ll be responsible for building our data pipeline architecture and working closely with our Data Scientists, who are currently taking care of this. We’re looking for someone who is either an experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up, or an experienced software engineer working with Python who is keen to dive deeper into the field of data engineering.
The typical responsibilities include:
Create and maintain optimal data pipeline architecture.
Assemble large, complex data sets that meet functional / non-functional business requirements.
Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and AWS technologies.
Create data tools for analytics and data science team members that assist them in building and optimizing our product into an innovative industry leader.
What we’re looking for:
A degree in Computer Science, Informatics, or another quantitative field.
Minimum 2–3 years of experience working with Python.
Experience working with SQL and NoSQL databases (Elasticsearch is a plus). Other nice-to-haves:
Technologies: Hadoop, Spark, RabbitMQ, etc.
API Deployment: Docker or Serverless
AWS cloud services
Preferably some experience building and optimizing ‘big data’ pipelines, architectures, and data sets (both structured and unstructured).
Ability to build processes supporting data transformation, data structures, metadata, dependency and workload management.
Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
What's in it for you?