Let’s talk about Signals

A still from signals.numer.ai film

Just give me the code

This notebook has taken inspiration from the example_model.py and Jason Rosenfeld’s notebook.

From a Numerai participant’s perspective 💡

From the tournament perspective, the main difference between the main Numerai tournament and Signals is the data. The main tournament provides obfuscated, clean, and normalized data ready for supervised learning, i.e., features + targets. Signals, on the other hand, gives only a list of symbols (tickers) in the Bloomberg universe and historical targets. That means we have to collect for ourselves any data that could make a good feature for prediction.

source: signals.numer.ai
In both tournaments, your predictions are scored on their correlation with live targets. You can stake on your predictions' goodness and, optionally, their uniqueness, and get paid based on these scores. Payouts:
Numerai tournament: CORR (+ MMC)
Signals tournament: 2 * CORR (+ MMC)


  1. List of tickers (symbols) in the latest round
  2. Get price data
  3. Perform Feature Engineering
  4. Modeling
  5. Prediction on the latest data
  6. Submission

Tickers in the latest round

There are some changes in the list of tickers that the tournament asks for every week. Luckily, we have an API for that.
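A minimal sketch of that call, assuming the numerapi package is installed and that its SignalsAPI.ticker_universe() method returns the current list of tickers. The helper name latest_universe and the optional client argument are illustrative; the argument is only there so the function can be tried without network access.

```python
def latest_universe(client=None):
    """Return the sorted, de-duplicated tickers for the current round."""
    if client is None:
        # Third-party dependency, imported lazily: pip install numerapi
        import numerapi
        client = numerapi.SignalsAPI()
    tickers = client.ticker_universe()
    # Drop blanks and duplicates so later joins behave predictably.
    return sorted({t.strip() for t in tickers if t and t.strip()})
```

Calling latest_universe() with no argument would hit the live API; passing any object with a ticker_universe() method works the same way.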

Getting the latest tickers using numerapi
Some tickers from the latest universe
The tournament uses Bloomberg tickers for scoring.

Getting historical data 📒

Historical financial data can be costly. But there’s a way we can get it for free.

  1. Load Bloomberg tickers
  2. Map them to yahoo tickers using a dictionary
  3. Load data using yahoo tickers
  4. Map yahoo tickers back to Bloomberg tickers
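The round trip in the steps above can be sketched with plain dictionaries. The three mapping entries are illustrative examples only, and the helper names to_yahoo and to_bloomberg are hypothetical; the real map has to cover the whole universe, and the actual download step (for example with a Yahoo Finance client) is left out here.

```python
# Illustrative entries only; a real map covers every ticker in the universe.
BLOOMBERG_TO_YAHOO = {
    "AAPL US": "AAPL",
    "7203 JP": "7203.T",
    "VOD LN": "VOD.L",
}


def to_yahoo(bloomberg_tickers, mapping):
    """Map Bloomberg tickers to Yahoo tickers, skipping unmapped symbols."""
    return {b: mapping[b] for b in bloomberg_tickers if b in mapping}


def to_bloomberg(data_by_yahoo, mapping):
    """Re-key data loaded under Yahoo tickers back to Bloomberg tickers."""
    reverse = {y: b for b, y in mapping.items()}
    return {reverse[y]: v for y, v in data_by_yahoo.items() if y in reverse}


universe = ["AAPL US", "7203 JP", "XYZ ZZ"]          # "XYZ ZZ" has no mapping
yahoo = to_yahoo(universe, BLOOMBERG_TO_YAHOO)
prices = {y: [] for y in yahoo.values()}             # placeholder for downloaded prices
prices_by_bloomberg = to_bloomberg(prices, BLOOMBERG_TO_YAHOO)
```

Tickers without a mapping are silently dropped in this sketch; logging them is usually worth it, since they are tickers you cannot submit predictions for.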
Tickers mapping
Getting historical data for tickers
Left: Prices for all tickers on every available date in a single DataFrame. Right: Prices grouped by date for all available tickers.

Feature Engineering 📐

Raw data needs to be structured and cleaned. We can try any data that seems related to the changes we want to predict.
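As an illustration, here are two common technical indicators written in plain Python: a simple moving average and a Relative Strength Index. The choice of these two indicators, the simple-average RSI variant, and the default window of 14 are assumptions made for this sketch, not necessarily what the notebook uses.

```python
def simple_moving_average(closes, window):
    """Trailing mean of the last `window` closes; None until enough history."""
    out = []
    for i in range(len(closes)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(closes[i + 1 - window : i + 1]) / window)
    return out


def rsi(closes, window=14):
    """Relative Strength Index using simple averages of gains and losses."""
    out = [None] * len(closes)
    # changes[j] is the move from closes[j] to closes[j + 1].
    changes = [b - a for a, b in zip(closes, closes[1:])]
    for i in range(window, len(closes)):
        recent = changes[i - window : i]
        avg_gain = sum(c for c in recent if c > 0) / window
        avg_loss = sum(-c for c in recent if c < 0) / window
        if avg_loss == 0:
            out[i] = 100.0  # convention: no losses in the window
        else:
            out[i] = 100.0 - 100.0 / (1.0 + avg_gain / avg_loss)
    return out
```

Both functions take one ticker's closing prices in date order and return a list of the same length, so the results line up with the original dates.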

Functions for calculating technical indicators

Bringing order to complexity

We need to learn the relative ranking of tickers in a given era. For that, converting these indicators to quintile labels can help. These quintile labels and their daily change of the past few days can be used as input features to the model.

  1. Calculate these scores
  2. Group by date (era) and create quintile labels, i.e., rank tickers within a single date
  3. Group by ticker and create lagged features. i.e.,
  4. Get daily changes in these lagged features
It is like embedding a sense of relative ranking among tickers for a day and their relative performance over the past few days using lagged features. This seems a really good structure for getting started with modeling. However, this is just one way of doing feature engineering. You should combine it with other approaches to get a higher MMC.
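The steps above can be sketched without any dependencies. This version assumes indicator scores are stored as a dictionary keyed by (date, ticker) with ISO-formatted date strings so that sorting keys sorts by time; the function names, the tie-breaking by ticker name, and treating lag 0 as the current day's label are all choices made for the example.

```python
from collections import defaultdict


def quintile_labels(scores):
    """{(date, ticker): value} -> {(date, ticker): label 0..4}, ranked per date."""
    by_date = defaultdict(list)
    for (date, ticker), value in scores.items():
        by_date[date].append((value, ticker))
    labels = {}
    for date, rows in by_date.items():
        rows.sort()  # ties are broken by ticker name
        n = len(rows)
        for rank, (_, ticker) in enumerate(rows):
            labels[(date, ticker)] = rank * 5 // n
    return labels


def lagged_features(labels, n_lags=5):
    """Per (date, ticker): labels at lag 0..n_lags, then their daily changes."""
    by_ticker = defaultdict(list)
    for (date, ticker), label in labels.items():
        by_ticker[ticker].append((date, label))
    features = {}
    for ticker, rows in by_ticker.items():
        rows.sort()  # chronological order for ISO date strings
        for i in range(n_lags, len(rows)):
            lags = [rows[i - k][1] for k in range(n_lags + 1)]
            diffs = [lags[k] - lags[k + 1] for k in range(n_lags)]
            features[(rows[i][0], ticker)] = lags + diffs
    return features
```

Rows that do not yet have n_lags days of history are dropped rather than padded, which keeps every feature vector the same length.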

Looking at the past

After collecting and engineering historical data, we need targets for those features. Numerai provides historical targets for training and evaluation, on which we can train our models in a supervised manner.
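Combining the two is an inner join on date and ticker. A minimal sketch, assuming both features and targets are dictionaries keyed by (date, ticker); the function name join_targets is hypothetical.

```python
def join_targets(features, targets):
    """Inner-join {(date, ticker): feature_list} with {(date, ticker): target}."""
    keys = sorted(features.keys() & targets.keys())
    X = [features[k] for k in keys]
    y = [targets[k] for k in keys]
    return keys, X, y
```

Only rows that have both features and a target survive, so tickers with missing price history simply drop out of the training set.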

Historical targets combined with features for modeling

Building for the future

Let’s train a simple model.
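As a stand-in for whatever model you prefer, here is a dependency-free linear model fitted by least squares. This is an assumption for illustration rather than the notebook's model; the function names and the small ridge term (added because lagged labels and their differences are linearly dependent) are choices made for this sketch.

```python
def fit_linear(X, y, ridge=1e-6):
    """Fit y ~ w.x + b by solving lightly regularised normal equations."""
    rows = [list(x) + [1.0] for x in X]  # append an intercept column
    d = len(rows[0])
    # A = X^T X + ridge * I, b = X^T y
    A = [[sum(r[i] * r[j] for r in rows) + (ridge if i == j else 0.0)
          for j in range(d)] for i in range(d)]
    b = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(d)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(d):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * p for a, p in zip(A[r], A[col])]
                b[r] -= factor * b[col]
    return [b[i] / A[i][i] for i in range(d)]  # feature weights, then intercept


def predict(weights, X):
    """Apply the fitted weights (the last entry is the intercept) to each row."""
    return [sum(w * v for w, v in zip(weights, list(x) + [1.0])) for x in X]
```

In practice a gradient-boosted tree or similar library model would replace fit_linear; the point is only that features go in, targets come out, and predictions are produced per ticker and date.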

Training a simple model


The predictions are evaluated based on the per-era correlation with targets. Here are some metrics we can immediately check our predictions on.

Some evaluation and plotting calculations
Scores on the validation set

Future prediction

For submission, we take the data from the last Friday before the round starts. This will have lagged features from the last 5 days and changes in them.
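Picking out those rows can be sketched with the standard library. This assumes features are keyed by (ISO date string, ticker) and that "last Friday" means the most recent Friday on or before the day you run the code; both function names are hypothetical.

```python
from datetime import date, timedelta


def last_friday(today):
    """Most recent Friday on or before `today` (Monday=0 ... Friday=4)."""
    return today - timedelta(days=(today.weekday() - 4) % 7)


def live_rows(features, today):
    """Keep only the rows of {(iso_date, ticker): feature_list} dated last Friday."""
    friday = last_friday(today).isoformat()
    return {ticker: row for (day, ticker), row in features.items() if day == friday}
```

For example, live_rows(features, date.today()) would return one feature vector per ticker, ready to be passed to the trained model.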

Predictions on last Friday date
Diagnostics for historical predictions
Combining historical and live predictions

What’s next?

  1. Look for other indicators
  2. Try different modeling techniques
  3. Create and evaluate your own targets
  4. Experiment and submit with more models
  5. Participate on RocketChat or Forum



Suraj Parmar

Solving problems, one sense, at a time. #ML , “The best way to learn is to teach.” parmarsuraj99@gmail.com About: https://parmarsuraj99.github.io/suraj-parmar/