SQL is a must-have skill for every data engineering interview. In most cases, the interviewer will give you an entity-relationship diagram (ERD) and ask you to write queries that answer basic questions about the data.
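As a rough illustration of that kind of question, here is a minimal sketch using Python's built-in `sqlite3` module. The two-table schema (`customers`, `orders`), the sample rows, and the function names are all made up for this example. A real interview ERD will differ, but the join-then-aggregate pattern comes up often.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id    INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
            amount      REAL NOT NULL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Linus');
        INSERT INTO orders VALUES (1, 1, 50.0), (2, 1, 25.0), (3, 2, 100.0);
        """
    )
    return conn


def top_customers_by_spend(conn, limit=3):
    """Return (name, total_spend) rows, highest spend first.

    The inner join drops customers who have no orders.
    """
    query = """
        SELECT c.name, SUM(o.amount) AS total_spend
        FROM customers AS c
        JOIN orders AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.name
        ORDER BY total_spend DESC, c.name
        LIMIT ?
    """
    return conn.execute(query, (limit,)).fetchall()


if __name__ == "__main__":
    for name, total in top_customers_by_spend(build_demo_db()):
        print(name, total)
```

A common follow-up is "how would you include customers with no orders?", which is usually answered by switching to a `LEFT JOIN` and wrapping the sum in `COALESCE`.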
Python is the most commonly used programming language for creating data engineering pipelines.
Modeling of data:
You will be tested on data modeling concepts (e.g. star schema, facts, and dimensions).
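To make those terms concrete, here is a small sketch of a star schema, again using `sqlite3`. The table names (`fact_sales`, `dim_product`, `dim_date`), the columns, and the rows are invented for illustration. The idea is that one fact table holds the measurements and points at descriptive dimension tables through keys.

```python
import sqlite3


def build_star_schema():
    """Create a hypothetical star schema: one fact table, two dimensions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Dimensions: descriptive attributes used to slice the facts.
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            name        TEXT,
            category    TEXT
        );
        CREATE TABLE dim_date (
            date_key  INTEGER PRIMARY KEY,
            full_date TEXT,
            month     TEXT
        );
        -- Fact: one row per sale, with measures and foreign keys.
        CREATE TABLE fact_sales (
            product_key INTEGER REFERENCES dim_product(product_key),
            date_key    INTEGER REFERENCES dim_date(date_key),
            quantity    INTEGER,
            revenue     REAL
        );
        INSERT INTO dim_product VALUES
            (1, 'Laptop', 'Electronics'),
            (2, 'Desk', 'Furniture');
        INSERT INTO dim_date VALUES
            (20240115, '2024-01-15', '2024-01'),
            (20240210, '2024-02-10', '2024-02');
        INSERT INTO fact_sales VALUES
            (1, 20240115, 2, 2000.0),
            (2, 20240115, 1, 300.0),
            (1, 20240210, 1, 1000.0);
        """
    )
    return conn


def revenue_by_category_and_month(conn):
    """Aggregate the fact table, grouped by attributes from both dimensions."""
    query = """
        SELECT p.category, d.month, SUM(f.revenue) AS revenue
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        JOIN dim_date AS d ON d.date_key = f.date_key
        GROUP BY p.category, d.month
        ORDER BY p.category, d.month
    """
    return conn.execute(query).fetchall()
```

Every analytical query follows the same pattern: start from the fact table, join out to whichever dimensions you need, then group and aggregate.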
Pipelines for data:
Typically, data engineering interviews include a question on data pipeline design. As you design the pipeline, expect follow-up questions about testing, backfilling, scalability, handling bad data, resolving dependencies, and so on.
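Two of those follow-ups, bad data and backfilling, can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the record shape (`id`, `amount`), the validation rule, and the use of dictionaries in place of a real warehouse and quarantine store. The point is the pattern: process one date partition at a time, set bad rows aside instead of failing the run, and overwrite the partition so a rerun produces the same result.

```python
from datetime import date, timedelta


def validate(record):
    """Return True if the record has a non-empty id and a non-negative amount."""
    if not record.get("id"):
        return False
    amount = record.get("amount")
    is_number = isinstance(amount, (int, float)) and not isinstance(amount, bool)
    return is_number and amount >= 0


def run_partition(extract, warehouse, quarantine, day):
    """Load one day's data, routing invalid records to quarantine.

    The partition is overwritten rather than appended to, so running
    the same day twice does not create duplicates (idempotent load).
    """
    good, bad = [], []
    for record in extract(day):
        (good if validate(record) else bad).append(record)
    warehouse[day] = good
    quarantine[day] = bad
    return len(good), len(bad)


def backfill(extract, warehouse, quarantine, start, end):
    """Re-run the pipeline for every day from start to end, inclusive."""
    day = start
    while day <= end:
        run_partition(extract, warehouse, quarantine, day)
        day += timedelta(days=1)
```

Because `run_partition` takes the extract function as an argument, it can be tested with a fake source that returns a handful of hand-written records, which also covers the testing question.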
Streaming of events:
Learn about event streams, why they exist, and when to use them.
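One way to build intuition is a toy, in-memory model of an event log. This is not the API of Kafka or any other real streaming system; the `EventLog` class and its methods are hypothetical. It only shows the core idea: producers append events to an ordered log, and each consumer reads at its own pace by tracking its own offset.

```python
class EventLog:
    """A toy append-only event log with independent consumer offsets."""

    def __init__(self):
        self._events = []   # ordered, never modified in place
        self._offsets = {}  # consumer name -> index of next event to read

    def append(self, event):
        """Add an event to the end of the log and return its offset."""
        self._events.append(event)
        return len(self._events) - 1

    def poll(self, consumer, max_events=10):
        """Return the next batch for this consumer and advance its offset."""
        start = self._offsets.get(consumer, 0)
        batch = self._events[start:start + max_events]
        self._offsets[consumer] = start + len(batch)
        return batch


if __name__ == "__main__":
    log = EventLog()
    for event in ("order_created", "order_paid", "order_shipped"):
        log.append(event)
    print(log.poll("billing", max_events=2))
    print(log.poll("analytics"))
```

Because reading does not remove events, a new consumer can start from the beginning of the log without affecting the others. This is one reason streams suit cases where several systems need the same events.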
Fundamentals of distributed systems:
Depending on the role, you may be asked about the fundamentals of distributed systems. It’s a good idea to learn how distributed systems work.
This is a common question in most software roles. For a data engineering position, you will be asked specifically about designing data pipelines, which requires you to understand the source system.