Gravitational Lens Finding Challenge
What is the best way to find a lens?
Finding strong gravitational lenses in the current imaging surveys is difficult. Future surveys will have orders of magnitude more data and more lenses to find. It will become impossible for a single human being to find them by inspection. In addition, to properly interpret the science coming out of strong lens samples it is necessary to accurately quantify the detection efficiency and bias of automated lens detectors. These open challenges are designed as a friendly way of stimulating activity and helping to quantify results in this regard.
Challenge 2.0 improves on the simulated images of Challenge 1.0 and uses different bands. This challenge concentrates only on Euclid-like observations in the VIS band and the NISP J, Y and H bands. The pixel sizes are 0.1'' for VIS and 0.3'' for J, Y and H. The training and test sets each consist of 100,000 images in each band.
The training set consists of 100,000 x 4 images and a catalog that gives the properties of each lens candidate. A description of each field is in the header of the catalog.
Data pack of images (19 GB)
Test sets will soon be available:
Data pack of images (19 GB)
Learning from our experience with the first challenge, a different method will be used to evaluate the submissions. A document describing scoring can be found here.
In addition to finding the lenses, this time you can also submit an estimated Einstein ring radius for each lens that is identified. These submissions will be scored using the mean squared error (MSE).
A submission should consist of a two-column, comma-separated text file. The first column should have the cutout identification number that is in the title of each file. The second column should have a score between 0 and 1, where larger values signify higher certainty that the candidate is a gravitational lens.
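As a rough illustration of the radius scoring, the sketch below computes a mean squared error between estimated and true Einstein radii. It assumes the error is averaged over the lenses for which an estimate was submitted; the exact definition used by the challenge is in the scoring document. The function name and sample values are hypothetical.

```python
def mean_squared_error(estimated, true):
    """Mean of the squared differences between paired radius values."""
    if len(estimated) != len(true):
        raise ValueError("inputs must have the same length")
    if not estimated:
        raise ValueError("inputs must not be empty")
    return sum((e - t) ** 2 for e, t in zip(estimated, true)) / len(estimated)


# Hypothetical Einstein radii in arcseconds (not challenge data)
estimated_radii = [1.0, 2.0, 0.5]
true_radii = [1.5, 2.0, 1.0]
mse = mean_squared_error(estimated_radii, true_radii)
```

Because the errors are squared, a few badly wrong radii weigh more heavily than many small errors.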
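A minimal sketch of writing such a file with Python's standard csv module is shown here. The cutout IDs, scores, number of decimal places and the function name write_submission are illustrative assumptions, not part of the challenge specification.

```python
import csv
import io


def write_submission(rows, handle):
    """Write (cutout_id, score) pairs as two comma-separated columns."""
    writer = csv.writer(handle, lineterminator="\n")
    for cutout_id, score in rows:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score for {cutout_id} is outside [0, 1]")
        writer.writerow([cutout_id, f"{score:.4f}"])


# Hypothetical cutout IDs and scores
rows = [(200001, 0.97), (200002, 0.03)]
buffer = io.StringIO()  # stands in for a real file in this sketch
write_submission(rows, buffer)
text = buffer.getvalue()
```

To write to disk, the in-memory buffer can be replaced with a file opened via open("submission.csv", "w", newline=""), where the file name is again only an example.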
The submission forms are here: submission form
Closing date for submissions is February 7.
This challenge has expired. A description of the results is in the paper Metcalf et al. 2018.
This is the first of what we expect to be at least two generations of challenges, each becoming harder and more realistic. This challenge will concentrate on galaxies that are lensed by galaxies (i.e. no clusters and no quasars).
There are two parts to the challenge, C.1 and C.2, designed to mock different types of data sets. They require separate entries. You can enter both or either one.
Training sets of 20,000 images, each 101 x 101 pixels, are provided for both challenges. For each image in the training sets, a classification as a lens or not a lens is provided in an ASCII log file. Additional information about each lensed image is also provided, such as the brightness of the lensed images. The training sets also contain images of the lens galaxy by itself and the lensed image by itself.
This data set mocks a space-based survey in one band. The challenge data set contains 100,000 images and the training set 20,000 images. The noise model is more idealized than for C.2, which may make classification somewhat easier.
Space Based Training Set
This data set mocks a ground-based, multi-band survey. The challenge data set consists of 100,000 images of 101 x 101 pixels in each of four bands (I, G, R, U). The training set contains 20,000 x 4 images. The bands should accurately represent galaxies at the appropriate redshifts, although no redshift information is provided. There are observational artifacts and masked regions in the images.
Ground Based Training Set
For each challenge, a training set of images is provided above along with a classification of each image as a lens or not a lens. These can be downloaded at any time. It is important to keep in mind when training an algorithm that some objects will be classified as lenses even though the source may be too dim to be detected. Also, the C.2 ground-based images do have some artifacts and masked regions (pixel values are set to 100 in masked regions).
When your algorithm is ready, and you are confident that there will be no problems with formats, i/o, etc., you can register for the challenge. You will be given access to a challenge data set in the same format as the training set. You will have 48 hours to classify the images and submit your entry. After this time entries will still be accepted, but they will not be eligible to win the challenge. This time limit is imposed to ensure that the method can really be considered automatic and that minimal human intervention is required.
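One way to keep masked pixels out of image statistics is sketched below with NumPy. It assumes the image has already been loaded into a 2D array and that the mask value of 100 can be tested by exact equality; the array contents and the function name are hypothetical.

```python
import numpy as np

MASK_VALUE = 100.0  # value stated above for masked pixels in C.2 images


def valid_pixel_mask(image):
    """Boolean array that is True where the pixel is not flagged as masked."""
    return image != MASK_VALUE


# Hypothetical 3 x 3 cutout with two masked pixels
image = np.array([[0.1, 0.2, 100.0],
                  [0.3, 100.0, 0.4],
                  [0.5, 0.6, 0.7]])
valid = valid_pixel_mask(image)
n_masked = int((~valid).sum())
mean_unmasked = float(image[valid].mean())
```

The same boolean array can be used to zero out or down-weight masked pixels before passing images to a classifier.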
You will be sent a report evaluating your entry. Authors of valid entries will be invited to participate in a publication presenting the results.
To register for the challenge and get access to the data, go to these pages: Space.1 registration (3.6 gigabytes), Ground.1 registration (14 gigabytes). You will be sent a unique reference ID number for your entry. Warning: These files are large and you will not be able to start and stop the download process midway through.
Each participating team must submit a list with the ID number of each lens and a likelihood or confidence level, p, that it is a lens. p can either be binary (0 not a lens, 1 a lens) or a continuous number between 0 and 1 expressing the method's confidence that it is a lens.
The submission file must have two columns in ASCII separated by white space (not a tab). The first column is the ID number of the candidate lens as given in the image file names and the second column is p.
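Before submitting, an entry can be checked against this format with a short script. The sketch below is one possible check, assuming that the ID numbers are integers and that blank lines may be ignored; neither assumption is stated on this page, and the function name and sample lines are hypothetical.

```python
def parse_entry(text):
    """Parse 'ID p' lines into a dict, checking the format described above."""
    entries = {}
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # assumption: blank lines are harmless and skipped
        if "\t" in line:
            raise ValueError(f"line {number}: tabs are not allowed")
        fields = line.split()
        if len(fields) != 2:
            raise ValueError(f"line {number}: expected 2 columns, got {len(fields)}")
        image_id, p = int(fields[0]), float(fields[1])
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"line {number}: p={p} is outside [0, 1]")
        entries[image_id] = p
    return entries


# Hypothetical entry mixing binary and continuous values of p
sample = "100000 1\n100001 0.25\n100002 0\n"
entries = parse_entry(sample)
```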
After February 5th 2017 no further entries will be accepted.
Entries will be evaluated by constructing the receiver operating characteristic (ROC). The ROC is the plot of the true positive rate, TPR, vs the false positive rate, FPR, defined as:
TPR = (number of true positives) / (total number of lenses)
FPR = (number of false positives) / (total number of non-lenses)
For a binary p this is a single point. For a continuous p, a ROC curve is made by plotting TPR against FPR as the p threshold for detection is varied. A method that guesses at random would have TPR = FPR and a ROC curve that is a straight line from (0,0) to (1,1). The area under the ROC curve will be used as a figure of merit for evaluating the methods.
This procedure will be done for several cuts in the sample, such as cuts on the brightness of the lensed image, the size of the lensed image and the size of the Einstein radius.
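The construction described above can be sketched in a few lines of Python. This is an illustration using the standard definitions of TPR and FPR and a trapezoidal area, not the organizers' evaluation code; the labels and scores are made up.

```python
def roc_curve(labels, scores):
    """Return FPR and TPR lists as the detection threshold on p is lowered."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one lens and one non-lens")
    # Sort by decreasing score so each step admits the next most confident candidate
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i]:
            tp += 1
        else:
            fp += 1
        # Emit a point only after all candidates tied at this score are counted
        is_last = rank == len(order) - 1
        if is_last or scores[order[rank + 1]] != scores[i]:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return sum((fpr[k] - fpr[k - 1]) * (tpr[k] + tpr[k - 1]) / 2.0
               for k in range(1, len(fpr)))


# Made-up truth labels (1 = lens, 0 = not a lens) and submitted p values
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
fpr, tpr = roc_curve(labels, scores)
auc = area_under_curve(fpr, tpr)
```

With a binary p, the tie handling collapses the curve to the end points (0,0) and (1,1) plus the single operating point of the method.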
The Challenge is over and many people have asked for the data sets and truth tables.