Adrian Dominiczak, Developer in Warsaw, Poland

Adrian Dominiczak

Verified Expert in Engineering

Data Engineer and Developer

Location
Warsaw, Poland
Toptal Member Since
July 21, 2020

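Adrian is a senior big data engineer with nearly a decade of professional experience. He started his career as a software engineer at Samsung's R&D center and has worked on a range of projects at Santander and Lingaro, from machine learning and big data engineering in the banking and pharmaceutical industries to big data and cloud architecture. Adrian's areas of expertise lie mainly with Hadoop and Spark.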

Portfolio

Roche
Bamboo, GitLab CI/CD, Docker, SQL, Conda, Pandas, Python, YARN, Hadoop, Spark
Lingaro
Spark, Kubernetes, Apache Airflow, Microsoft Power BI, SQL, Python, Redis...
Santander Consumer Technology Services GmbH
Kudu, Apache Hive, SQL, Pandas, Python, Scala, Bash, RHEL, Spark, Cloudera...

Experience

Availability

Full-time

Preferred Environment

IntelliJ IDEA, PyCharm, Linux

The most amazing...

...thing I've worked on is optimizing a Spark application that measures the accuracy of ML models monitoring the health state of client machines.

Work Experience

Big Data and ML Engineer

2019 - 2020
Roche
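  • Designed, implemented, and productionized software written in Spark for monitoring the accuracy of statistical models that track the health state of medical machines.
  • Designed and improved the project structure by refactoring existing projects before deployment, in the domain of automatic medical document generation and knowledge retrieval from medical documents.
  • Designed and developed solutions for processing, auto-generation, and knowledge extraction from documents of medical origin.
Technologies: Bamboo, GitLab CI/CD, Docker, SQL, Conda, Pandas, Python, YARN, Hadoop, Spark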

Big Data Architect | Technical Leader

2019 - 2019
Lingaro
  • Prepared, on behalf of a software company, an offer covering architecture design, scope, and pricing for a project connecting several independent data platforms with batch and NRT generated data, with a data mart and dashboarding developed in MS Azure.
  • Provided architecture and team lead support in an acquired project.
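  • Analyzed the client's business needs and translated them into technical requirements.
  • Coordinated project development and delivery using Agile methodologies.
  • Took part in improving and refactoring code and mentored junior developers.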
  • Took part in sales activity.
Technologies: Spark, Kubernetes, Apache Airflow, Microsoft Power BI, SQL, Python, Redis, Microsoft Azure

Big Data Architect

2018 - 2019
Santander Consumer Technology Services GmbH
  • Monitored and improved production Hadoop clusters, ETL processes, and resource utilization.
  • Coordinated projects by serving as a single point of contact for stakeholders from the business domain and a team of developers; also monitored, planned, and reported on projects before going live.
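  • Mentored and managed a small team of junior developers while leading the development of a PySpark reporting application using Agile methodologies.
  • Set up development environments and test deployments of software from external providers; also created reports, documentation, and tutorials.
  • Analyzed the architecture, functionality, and performance of solutions from external providers.
  • Took part in meetings with external software providers, including managers and architects.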
Technologies: Kudu, Apache Hive, SQL, Pandas, Python, Scala, Bash, RHEL, Spark, Cloudera, YARN, HDFS, Hadoop

Big Data and ML Engineer

2017 - 2018
Roche
  • Served as a machine learning and big data expert while obtaining external software (implemented in AWS) for extracting data from a medical origin document; also prepared for the internal knowledge transfer to a support team.
  • Designed and improved the project structure by refactoring existing projects as needed before deployment.
  • Designed and developed solutions for medical origin document analysis, processing, and auto-generation.
Technologies: Elasticsearch, Bamboo, GitLab CI/CD, Docker, SQL, Conda, Pandas, Python, YARN, Hadoop, Spark

Big Data Engineer

2015 - 2017
mBank S.A.
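  • Implemented algorithmic trading software (using ML methods) that traded live on S&P 500 stocks.
  • Designed and implemented ML-based credit-scoring models.
  • Implemented a web service for custom visualization of business data hosted on a Hadoop cluster.
Technologies: JavaScript, H2, Play, SQL, R, Scala, Python, Java, Apache Sqoop, YARN, Hadoop, Spark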

Software Engineer

2014 - 2015
Samsung Electronics Poland, R&D Center, Artificial Intelligence Group
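  • Designed, implemented, and supported modules of an NLP user utterance recognition engine.
  • Implemented a web application used internally by linguists as a tool for collecting, cleaning, and labeling data sets for training NLP (natural language processing) machine learning models.
  • Implemented a closed-domain knowledge database and a web scraper as its sourcing tool.
  • Implemented a Prolog-to-Java connector so that a knowledge database stored in Prolog format could be used from an internal Java library to build statistical models in the NLP domain.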
Technologies: Weka, Prolog, JavaScript, SQL, Python, Java

Programmer

2013 - 2014
Polish Academy of Sciences
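  • Found a way to accurately identify and distinguish internal bone structure based on scattered ultrasound signals, using machine learning and time series analysis methods.
  • Proposed a new method for identifying skin cancer changes based on ultrasound signals, using advanced time series analysis and the mathematical framework of complex networks.
  • Proposed a new method for studying time series of medical origin, using the mathematical framework of mapping time series to complex networks.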
Technologies: Mathematica, MATLAB, Python

Algorithmic Trading

Project: A machine learning-based algorithmic trading app running live on S&P 500 stocks.

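I implemented the modules for training and for making daily predictions with the built models, and took part in discussions about the mathematical approach to portfolio handling and rebalancing. I also integrated data from various sources: the internet, data providers, and others.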

Data Mart in MS Azure with a Dashboard

Project: An MS Azure cloud solution that fits the needs of the client.

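I architected, designed, and supported the development of a Microsoft Azure cloud solution that synchronized independent data platforms generating data at different frequencies, from one-day batches to NRT. I also designed the ETL pipelines, data storage, data mart, and a fast, efficient dashboarding solution.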

Statistical Models Validation Software

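Project: A Spark-based application for calculating the accuracy of statistical models that predict the health state of medical machines from data sent by all sensors.

I optimized and implemented an advanced PoC algorithm and prepared it for production deployment.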

User Utterance Recognition

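Project: A Java-based natural language processing framework for a leading electronics manufacturer.

I designed and implemented the framework in pure Java, intended for use as an internal library. The framework performed sentence recognition using a hybrid engine, which was fed by both machine-learning-based and rule-based predictors.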

Languages

Python, Java, SQL, JavaScript, Prolog, Scala, R, Bash

Frameworks

Spark, Hadoop, YARN, Play

Other

Big Data, Data Analytics, Data Engineering, Big Data Architecture, Applied Mathematics, Machine Learning, Statistics, Computational Physics, Conda, RHEL, Microsoft Azure

Paradigms

ETL Implementation & Design, ETL, Data Science

Platforms

Amazon Web Services (AWS), Google Cloud Platform (GCP), Linux, Docker, Kubernetes

Storage

Databases, H2, Elasticsearch, HDFS, Apache Hive, Redis

Libraries/APIs

Pandas

Tools

PyCharm, IntelliJ IDEA, MATLAB, Mathematica, Weka, Apache Sqoop, GitLab CI/CD, Bamboo, Cloudera, Kudu, Microsoft Power BI, Apache Airflow

2011 - 2014

Master of Science (MSc) Degree in Applied Physics

Warsaw University of Technology, Faculty of Physics - Warsaw, Poland

2008 - 2011

Bachelor of Science (BSc) Degree in Physics

Warsaw University of Technology, Faculty of Physics - Warsaw, Poland

JUNE 2020 - PRESENT

Essential Google Cloud Infrastructure: Foundation

Coursera

JUNE 2020 - PRESENT

Google Cloud Platform Fundamentals: Core Infrastructure

Coursera

JUNE 2020 - PRESENT

Essential Google Cloud Infrastructure: Core Services

Coursera
