
at Tenth Revolution Group
£300 - £350 per day
London, EC3V 3LA, Greater London, GB
Remote | Full Time
Design, build, and maintain robust ETL/ELT pipelines for structured and unstructured data
Hands-on experience with AWS Glue and AWS Step Functions
Implementation of data validation, data quality frameworks, and reconciliation checks
Strong error handling, monitoring, and retry strategies in production pipelines
Experience with incremental data processing patterns (CDC, watermarking, upserts)
Amazon S3: data lake architectures, partitioning strategies, lifecycle policies
DynamoDB: data modeling, secondary indexes, streams, and performance optimization
Amazon Redshift: foundational querying, integrations, and performance considerations
AWS Lambda for scalable data processing and orchestration
Amazon EventBridge for event-driven and decoupled data pipelines
Strong understanding of vector database concepts, indexing strategies, and performance trade-offs
Design and implementation of embedding generation pipelines
Optimization techniques for semantic search and retrieval accuracy
Effective chunking strategies for document ingestion and processing
Experience with CockroachDB deployment and management is beneficial
Experience with PDF parsing tools such as PyPDF2, pdfplumber, and Amazon Textract
Integration of OCR solutions (Amazon Textract, Tesseract) for scanned documents
Extraction of document structure (headings, tables, sections)
Metadata extraction, normalization, and enrichment
Handling of multiple document formats including PDF, HTML, and DOCX
Familiarity with SAP data structures is beneficial
Integration with PIM (Product Information Management) systems
Design and consumption of REST APIs
Python (advanced): pandas, numpy, boto3, and data processing best practices
SQL (advanced): complex queries, performance tuning, and query optimization
Data profiling and ongoing quality assessment
Schema validation and evolution strategies
Data lineage tracking and observability
Understanding of Master Data Management (MDM) concepts
Product catalog data models and hierarchies
E-commerce data patterns and integrations
B2B data exchange and system integration
To apply for this role, please submit your CV or contact Dillon Blackburn on (phone number removed) or at (url removed).
Tenth Revolution Group are the go-to recruiter for Data & AI roles in the UK, offering more opportunities across the country than any other recruitment agency. We're the proud sponsor and supporter of SQLBits, Power Platform World Tour, and the London Fabric User Group. We are the global leaders in Data & AI recruitment.