The role will be responsible for designing, building, and optimizing high‑volume, real‑time and batch data platforms on Databricks, with increasing adoption of Databricks AI and enterprise AI tooling. The position requires strong PySpark/SQL engineering, hands‑on Databricks experience, exposure to AI/ML platforms, and solid domain understanding of Commodity Trading. The candidate should be able to own solutions end‑to‑end and work closely with business and technology stakeholders.
Key Responsibilities
Design, develop, and maintain Databricks-based data pipelines for batch and real-time processing
Build scalable solutions using PySpark, SQL, Delta Lake, and Delta Live Tables (DLT)
Implement Structured Streaming for high‑volume, low‑latency data processing (see the illustrative sketch after this list)
Optimize performance and cost across Databricks workloads
Implement CI/CD pipelines for Databricks notebooks, jobs, and workflows
Leverage Databricks AI capabilities to support ML and GenAI use cases
Collaborate with data scientists, AI engineers, and business teams to deliver analytics and AI solutions
Take full ownership of solutions from design → build → deploy → support
Clearly communicate technical solutions and trade‑offs to stakeholders
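For illustration only, below is a minimal PySpark Structured Streaming sketch of the kind of pipeline this role builds: ingesting a high‑volume event feed and writing it incrementally to a Delta table. The broker, topic, schema, checkpoint path, and table names are hypothetical placeholders, not part of any actual platform.

```python
# Minimal sketch: stream high-volume trade events from a Kafka topic into a Delta table.
# Broker, topic, schema, path, and table names below are hypothetical placeholders.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, DoubleType, TimestampType

spark = SparkSession.builder.getOrCreate()  # already provided in Databricks notebooks

# Expected shape of each incoming trade event (illustrative)
event_schema = StructType([
    StructField("trade_id", StringType()),
    StructField("commodity", StringType()),
    StructField("price", DoubleType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # hypothetical broker
    .option("subscribe", "trade-events")                # hypothetical topic
    .load()
)

# Parse the JSON payload into typed columns
parsed = (
    raw.select(F.from_json(F.col("value").cast("string"), event_schema).alias("e"))
    .select("e.*")
)

# Incremental append to a Delta table; the checkpoint tracks progress across restarts.
query = (
    parsed.writeStream
    .format("delta")
    .option("checkpointLocation", "/mnt/checkpoints/trade_events")  # hypothetical path
    .outputMode("append")
    .trigger(availableNow=True)  # incremental batch-style run; use processingTime for continuous
    .toTable("bronze.trade_events")
)
```

The checkpoint location lets the stream resume where it left off, and `trigger(availableNow=True)` runs the stream as an incremental batch rather than an always-on cluster, a common lever for balancing latency against Databricks compute cost.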
Required Technical Skills
Databricks Engineering
Strong hands‑on experience with:
PySpark and SQL
Batch data processing
Delta Lake & Delta Live Tables (DLT)
Structured Streaming
High‑volume live data processing
Proven experience in performance tuning and cost optimization
Experience implementing CI/CD for data and analytics platforms
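As a sketch of the Delta Live Tables (DLT) skill area, the following shows a minimal bronze‑to‑silver pipeline with Auto Loader ingestion and one data‑quality expectation. The landing path, table names, and columns are hypothetical, and the code only runs inside a DLT pipeline on Databricks.

```python
# Minimal Delta Live Tables (DLT) sketch: raw ingest (bronze) plus a validated silver table.
# Landing path, table names, and columns are hypothetical; runs only inside a DLT pipeline,
# where the `spark` session object is provided automatically.
import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw trade events landed as-is (bronze).")
def bronze_trades():
    # Auto Loader incrementally picks up new JSON files from cloud storage.
    return (
        spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/landing/trades/")  # hypothetical landing path
    )

@dlt.table(comment="Cleaned, typed trade events (silver).")
@dlt.expect_or_drop("valid_price", "price > 0")  # data-quality rule: drop rows with bad prices
def silver_trades():
    return (
        dlt.read_stream("bronze_trades")
        .select(
            F.col("trade_id"),
            F.col("commodity"),
            F.col("price").cast("double"),
            F.col("event_time").cast("timestamp"),
        )
    )
```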