Data Engineer
Millennium is a global investment management firm, built on a sophisticated operating system at scale. We pursue a diverse array of investment strategies, and we empower our employees to deliver exceptional outcomes, enabling our portfolio managers to do what they do best: navigate the markets.
Job Description
We are building a world-class systematic data platform which will power the next generation of our systematic portfolio engines.
The latency-critical trading group is looking for a Data Scientist/Engineer to join our growing team. The team consists of low-latency Linux engineers, network engineers, datacenter engineers, and C++ engineers who are responsible for building our low-latency stack.
This is an opportunity for individuals who are passionate about quantitative investing. The role builds the individual's knowledge and skills in four key areas: data, statistics, technology, and financial markets.
Desirable Candidates
Ph.D. or master's degree in computer science, mathematics, statistics, or another field requiring quantitative analysis, or equivalent experience
Three years of financial industry experience
Experience working with systematic investing teams is a plus
Programming expertise in Python and C++
Data analysis experience with R, MATLAB, scikit-learn, PyTorch, or similar
Programming skills in SQL
Strong problem-solving skills
Effective communication skills
Strong understanding of modern statistical testing methods
Job Responsibilities
Monitoring and Analysis
Assessing the quality of all captured market data, both live and historical
Inventorying all data gaps and closing them via multiple techniques (other internal captures, vendor data, exchange data, etc.); see the sketch after this list
Extending parsing to identify other kinds of gaps
Understanding and documenting session times, including holiday schedules
Comparing day-on-day changes in latency, data rate, bursts, etc.
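To illustrate the gap-inventory work above, a minimal Python sketch; the per-channel sequence-number schema is an assumption for illustration, not a real feed format:

from collections.abc import Iterable

def find_gaps(messages: Iterable[tuple[str, int]]) -> list[tuple[str, int, int]]:
    """Return (channel, first_missing_seq, last_missing_seq) for each gap."""
    last_seen: dict[str, int] = {}
    gaps = []
    for channel, seq in messages:
        prev = last_seen.get(channel)
        if prev is not None and seq > prev + 1:
            # Sequence jumped: everything between prev and seq was missed.
            gaps.append((channel, prev + 1, seq - 1))
        if prev is None or seq > prev:
            last_seen[channel] = seq
    return gaps

# Channel "A" is missing sequence numbers 3 and 4:
print(find_gaps([("A", 1), ("A", 2), ("A", 5), ("B", 7), ("B", 8)]))
# [('A', 3, 4)]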
Research
Documenting the history of exchange microstructure behavior changes
Understanding and documenting message timestamp rules for each exchange
Documenting the history of session schedule changes and protocols
Alpha research and presentation to portfolio managers and trading teams
Building technology tools to acquire and tag datasets
Engaging with vendors and brokers to understand the characteristics of datasets
Interacting with portfolio managers and quantitative analysts to understand their use cases
Building exchange microstructure expertise and helping educate clients as needed
Analyzing datasets to generate key descriptive statistics (see the sketch after this list)
Cleaning historical data
Utilizing and maintaining world-class data processing and transformation techniques
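To illustrate the cleaning and descriptive-statistics items above, a minimal pandas sketch; the column names (ts, price, size) are illustrative assumptions, not a real schema:

import pandas as pd

def summarize(df: pd.DataFrame) -> pd.DataFrame:
    df = df.dropna(subset=["price", "size"])     # drop incomplete rows
    df = df[df["price"] > 0]                     # remove bad prints
    df = df.sort_values("ts").drop_duplicates()  # order and dedupe
    # Tail percentiles matter more than the mean for microstructure work.
    return df[["price", "size"]].describe(percentiles=[0.01, 0.5, 0.99])

trades = pd.DataFrame({
    "ts": pd.to_datetime(["2024-01-02 09:30:00", "2024-01-02 09:30:01"]),
    "price": [100.25, 100.26],
    "size": [200, 150],
})
print(summarize(trades))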
Tools
Building data analysis tools on top of our captured data
Improving data visualization tools and capabilities
Technical
Consolidating several PCAPs into a single PCAPNG with metadata representing capture location and time across capture regions (first sketch after this list)
Assessing PTP (Precision Time Protocol) quality within a single source and across multiple sources (second sketch after this list)
Managing large data sets in a hybrid multi-cloud/on-prem environment
Building performant systems able to analyze terabytes of data quickly
Enhancing the firm's C++ analytics and exposing them in a Python environment
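First sketch: consolidating per-site PCAPs into one PCAPNG. This uses Wireshark's mergecap CLI (which must be on PATH), which writes PCAPNG by default and interleaves packets chronologically; the capture-site mapping and file names are hypothetical:

import subprocess
from pathlib import Path

CAPTURE_SITES = {            # hypothetical capture locations
    "ny4.pcap": "Secaucus",
    "ld4.pcap": "Slough",
}

def consolidate(capture_dir: Path, out_file: Path) -> None:
    inputs = sorted(str(capture_dir / name) for name in CAPTURE_SITES)
    # mergecap merges on packet timestamps by default and emits PCAPNG;
    # a production pipeline would also record location/time metadata,
    # e.g. in PCAPNG comment or interface description blocks.
    subprocess.run(["mergecap", "-w", str(out_file), *inputs], check=True)
    for name, site in CAPTURE_SITES.items():
        print(f"merged {name} (captured at {site})")

consolidate(Path("captures"), Path("consolidated.pcapng"))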
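Second sketch: assessing PTP quality from per-host clock offsets. It assumes offsets in nanoseconds have already been extracted, e.g. from a PTP daemon's log output; the hosts and sample values are illustrative:

import numpy as np

offsets_ns = {               # hypothetical offset samples per capture host
    "capture-ny4": np.array([35, -20, 12, 48, -5], dtype=float),
    "capture-ld4": np.array([110, 95, 130, 87, 102], dtype=float),
}

for host, x in offsets_ns.items():
    # Within a single source: spread and worst-case offset from the master.
    print(f"{host}: mean={x.mean():+.1f}ns std={x.std():.1f}ns "
          f"p99(|offset|)={np.percentile(np.abs(x), 99):.1f}ns")

# Across sources: a persistent mean disagreement hints at asymmetric
# network paths or poor holdover on one of the clocks.
hosts = list(offsets_ns)
delta = offsets_ns[hosts[0]].mean() - offsets_ns[hosts[1]].mean()
print(f"cross-source mean delta: {delta:+.1f}ns")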