About the role:
Our client is seeking a Data Scientist with strong data analysis skills to create large-scale analytical solutions (data products and applications) for client-driven business problems.
At the core of our client’s technology stack is a fast parallel-processing database that powers all of its product offerings. Your work will involve using this core capability to analyze billions of rows of data to solve business problems. You will be responsible for conducting analysis to improve the features, functionality, and data quality of these analytical solutions.
You will also work closely with the data engineering and operations teams to prototype and support high-performance data pipelines and the corresponding operational processes. Once the pipelines are deployed, you will be responsible for maintaining the data quality and stability of existing products and for handling client inquiries about the data.
You are curious and passionate about using your analytical skills to answer complex questions from real-world datasets. You work in line with business goals and strive to deliver high-quality products that further the client’s growth. You will primarily use the client’s Macro Language to interact with the core database, and will also use Airflow, Scala, Spark, and Python to varying degrees.
You will be part of a diverse team, with rich backgrounds in physics, computer science, statistics, finance, and engineering, collaborating to build data products from massive datasets. You will need to develop domain expertise in these data products and be able to propose novel solutions to challenging problems.
You will be based in the US and comfortable working remotely with the Data Science, Data Quality, Data Engineering, and account management teams.
This role is not eligible for visa sponsorship.
What you will take on:
- Become an expert user of Macro Language to analyze and process massive datasets
- Use your understanding of datasets and business problems to research analytical solutions to further product capabilities
- Use Macro Language, Spark, Scala, Python, and Airflow to design, build, and maintain data pipelines
- Develop systems and processes to transform source data into knowledge and insights
- Collaborate with product management, data engineering, and data quality teams in producing high-quality products
- Conduct ad-hoc data analyses to resolve production and client data issues
- Find and implement optimizations, improvements, and design modifications to data engineering challenges
- Develop quick proof-of-concept analyses and audit decks on new source data
- Participate in agile development sprints and share your progress updates regularly
What you already have:
- BS/MS in a highly analytical discipline (Computer Science, Physics, Mathematics, Econometrics) or equivalent
- 1-3 years of professional experience in data analysis, or practical experience building customer-facing analytical products
- Database experience (understanding of database structures and query languages such as SQL)
- Demonstrated experience with scripting languages and statistical software (R, SAS, SPSS, MATLAB), and comfort learning new tools as needed
- Solid understanding of statistical concepts
Desired:
- Experience developing data products using consumer spend data
- Experience working with parallel processing frameworks like Spark
- Experience constructing data pipelines using Airflow
- Background in vector/matrix arithmetic is a plus
- Experience with list/vector-based languages
- Strong, positive interpersonal skills
- Ability to communicate clearly, consider options presented by others, and reach an informed, balanced technical opinion
- Ability to create clear, concise memos, summaries, design documentation, and presentations