My Project for GSoC 2024

Beginning of GSoC journey

My Organization

Contributing to NumFOCUS

I’m excited to announce that I’m contributing to NumFOCUS for Google Summer of Code (GSoC) 2024. NumFOCUS is a non-profit organization dedicated to promoting accessible and reproducible scientific and technical computing. Through its support and advocacy, NumFOCUS fosters world-class, innovative, open-source scientific software. Acting as an umbrella organization in Google Summer of Code, it provides a platform for various groundbreaking projects.

Collaboration with Open Science Labs

Within the NumFOCUS umbrella, my project is in collaboration with Open Science Labs. Open Science Labs (OSL) is a global platform committed to advancing scientific research through collaboration, innovation, and education. OSL focuses on fostering an inclusive, transparent, and accessible scientific community. It serves as a hub for sharing information and developing tools in open science and computational fields, particularly supporting Research Software Engineers (RSEs) in overcoming computational challenges in their work.


My project

Verify the reproducibility of an experiment.

Project Goals

This project is centered around one goals, which is to implement an algorithm to compare the provenance from two (or more) trials (i.e., executions of an experiment) to check their reproducibility. The given task concentrated on two specific challenges:

1. Compare Provenance Data

The first goal is to compare the values of trials data. The data given is a provenance of a trial. A provenance is a detailed record of the data, processes, and parameters involved in an experiment, allowing for the tracking and verification of results. Trial data is stored when running the experiments, which can be used later on with an SQLite database. The comparison will be using an Abstract Syntax Tree (AST) comparison, enabling a structured and precise examination of the trial data.

2. Algorithm for calculate reproducibility

The second goal focuses on the main idea of reproducibility. Reproducibility is the ability to achieve consistent results across different trials or experiments under the same conditions. The algorithm aims to evaluate the reproducibility by analyzing the provenance data from different trials. This analysis is crucial for identifying discrepancies and ensuring that experimental results are reliable and can be independently verified. The development of this algorithm will involve rigorous testing and validation to ensure its effectiveness in assessing reproducibility.


My Proposal

Check my proposal for NumFOCUS, GSoC 2024 here.
Tags: GSoC Blog