Junior Data Engineer

As a Junior Data Engineer, you will work within our operations team to help build data solutions and tools that ensure our data platforms remain relevant for dataengine and our customers.

Our approach is to deploy whatever technology best solves the problem at hand. This often means learning from the cutting edge of our partner and open-source communities, or developing our own solutions.

What You Will Do: 

  • Implement a range of data products to the agreed design specifications.
  • Support the adoption and use of data products, including data ingestion.
  • Assemble large, complex data sets that meet functional / non-functional business requirements.
  • Develop a deep understanding of a core component of the data platform stacks on the dataengine, AWS or Azure clouds, think through all of the moving pieces and make recommendations on design and configuration.
  • Work with customer-facing engineers to design Big Data solutions for clients in the public and private sectors.
  • Design and implement modern ETL platforms and data processing solutions using cloud-based services (S3, Redshift, Cosmos DB, etc.), and deprecate legacy on-premises systems.
  • Build analytics platform tools that utilise the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
  • Work with stakeholders including the Executive, Product, Data and Design teams to assist with data-related technical issues and support their data platform needs.
  • Engage in service capacity planning, software upgrades, performance analysis and system tuning and maintenance.
  • Assist with level 1–3 support calls for dataengine customers.
  • Prepare data for modelling and work closely with data scientists to identify models that drive business value.
  • Move models from prototype into a scalable parallel processing framework and optimise them for best performance.
  • Prototype data for BI or data science projects using an agile methodology.
  • Own the development and maintenance of ongoing metrics, reports, analyses, dashboards and other outputs that drive key business decisions.
  • Recognise and adopt best practices in reporting and analysis: data integrity, test design, analysis, validation, and documentation.
  • Visualise data using modern visualisation tools.
  • Code pipelines using Python, C# and other languages.
  • Maintain Azure and other cloud certifications.
  • Help document and manage deployed customer platforms.
  • Meet timeframe and quality requirements for tasks assigned to you.
  • Communicate the business benefits of the product and its features.


About You
 

  • 3+ years of experience in a Data Engineer role, with a graduate degree in Computer Science, Statistics, Informatics, Information Systems or another quantitative field.
  • Experience building and optimising ‘big data’ data pipelines, architectures and data sets.
  • Experience performing root cause analysis on internal and external data and processes to answer specific technical / business questions and identify opportunities for improvement.
  • Experience with Big data software stacks, specifically Hadoop and Apache Spark frameworks.
  • Strong analytic skills related to working with unstructured datasets.
  • Certified in Cloudera, Pentaho, Hortonworks, MapR or AWS data technologies.
  • Experience building processes supporting data transformation, data structures, metadata, dependency and workload management.
  • A successful history of manipulating, processing and extracting value from large disconnected datasets.
  • Working knowledge of message queuing, stream processing, and highly scalable ‘big data’ data stores.
  • Strong business analysis and organisational skills.
  • Advanced SQL knowledge and experience with relational databases, including query authoring and working familiarity with a variety of database systems.
  • Experience supporting and working with cross-functional teams in a dynamic environment.
  • Experience using the following software/tools:
      • Big data tools: Hadoop, Spark, Kafka, Sqoop, Hive, etc.
      • Big Data and Linux security features, e.g. Sentry, Navigator.
      • Relational SQL and NoSQL databases, including MongoDB, Postgres and Cassandra.
      • Azure and AWS cloud services: EC2, EMR, RDS, Redshift, Cosmos DB, HDInsight.
      • Stream-processing systems: Storm, Spark Streaming, etc.
      • Object-oriented/functional scripting languages (Python, Java, C++, Scala) and configuration management tools (Chef, Puppet, Ansible).
      • The Linux operating system.
  • Growth mindset, positive attitude and a willingness to learn.
  • Good writing and communications skills.
  • Able and willing to lean into technical tasks as set by the Architects.


Here’s a Taste of What’s on Offer at dataengine:
 

  • Opportunity to solve real-world business problems.
  • Collaborative and supportive work environment.
  • Competitive salary.
  • Be at the forefront of data science innovation in a dynamic startup environment.
  • Work on a variety of challenging and impactful projects across different industries.
  • Opportunity to learn and grow your skillset with the latest technologies.
  • Continuous learning and development opportunities.
  • Weekly fresh fruit and snacks, monthly industry-led lunches and active staff activities.
  • Be part of a company that is shaping the future of data-driven solutions.


If you are a passionate and talented data geek who is excited about making a real impact, we encourage you to apply!
 


About dataengine:
 

dataengine, a New Zealand-based leader in data solutions, empowers businesses to unlock the hidden value within their data and achieve true transformation. We connect people, processes, and technology for optimised efficiency and sustainable impact. Founded in 2018, we’ve grown rapidly thanks to our exceptional team and commitment to delivering practical, impactful results. We believe in building the foundation right, starting with strategic planning and robust architecture. 


What sets us apart?
 

  • Experienced & Pragmatic: Our team boasts extensive expertise, focusing on tangible outcomes, not just presentations.
  • Small Wins, Big Impact: We deliver incremental value drops, ensuring our customers see results quickly and continuously.
  • Holistic Enablement: We develop with our customers, in their environment, empowering our team to own and optimise solutions.


Leading the way in Generative AI:
We’ve developed a unique standalone AI engine, providing secure and accessible tools for our customers’ AI journey. From use case brainstorming to model deployment and tuning, we offer comprehensive support throughout the entire lifecycle. 


Beyond AI:
Our services range from advisory and resourcing to full-scale platform design and build, catering to any data-driven business need.