GSoC 2024 - Week 12 Progress

Wrap Up GSoC 2024


Additional Feature

This week, I introduced a significant new feature to align with the project goals. The new now ast command provides access to a trial’s Abstract Syntax Tree (AST). By default, the output is a string generated by ast.dump, based on the trial’s definition rather than the script itself, as shown below:

```bash
now ast 1.1.1
Module(body=[Import(names=[alias(name='numpy', asname='np')]), Import(names=[alias(name='matplotlib.pyplot', asname='plt')]), ImportFrom(module='precipitation', names=[alias(name='read'), alias(name='prepare')], level=0), FunctionDef(name='bar_graph', args=arguments(posonlyargs=[], args=[arg(arg='years')], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Global(names=['PREC', ' MONTHS']), Expr(value=Call(func=Name(id='prepare', ctx=Load()), args=[Name(id='PREC', ctx=Load()), Name(id='MONTHS', ctx=Load()), Name(id='years', ctx=Load()), Name(id='plt', ctx=Load())], keywords=[])), Expr(value=Call(func=Attribute(value=Name(id='plt', ctx=Load()), attr='savefig', ctx=Load()), args=[Constant(value='"out.png"')], keywords=[]))], decorator_list=[], type_params=[]), Assign(targets=[Name(id='MONTHS', ctx=Load())], value=BinOp(left=Call(func=Attribute(value=Name(id='np', ctx=Load()), attr='arange', ctx=Load()), args=[Constant(value='12')], keywords=[]), op=Add(), right=Constant(value='1'))), Assign(targets=[Tuple(elts=[Name(id='d13', ctx=Load()), Name(id='d14', ctx=Load())], ctx=Load())], value=Tuple(elts=[Call(func=Name(id='read', ctx=Load()), args=[Constant(value="'p13.dat'")], keywords=[]), Call(func=Name(id='read', ctx=Load()), args=[Constant(value="'p14.dat'")], keywords=[])], ctx=Load())), Assign(targets=[Name(id='PREC', ctx=Load()), Tuple(elts=[Name(id='prec13', ctx=Load()), Name(id='prec14', ctx=Load())], ctx=Load())], value=Tuple(elts=[List(elts=[], ctx=Load()), List(elts=[], ctx=Load())], ctx=Load())), For(target=Name(id='i', ctx=Load()), iter=Name(id='MONTHS', ctx=Load()), body=[Expr(value=Call(func=Attribute(value=Name(id='prec13', ctx=Load()), attr='append', ctx=Load()), args=[Call(func=Name(id='sum', ctx=Load()), args=[Subscript(value=Name(id='d13', ctx=Load()), slice=Name(id='i', ctx=Load()), ctx=Load())], keywords=[])], keywords=[])), Expr(value=Call(func=Attribute(value=Name(id='prec14', ctx=Load()), attr='append', ctx=Load()), args=[Call(func=Name(id='sum', ctx=Load()), args=[Subscript(value=Name(id='d14', ctx=Load()), slice=Name(id='i', ctx=Load()), ctx=Load())], keywords=[])], keywords=[]))], orelse=[]), Expr(value=Call(func=Name(id='bar_graph', ctx=Load()), args=[List(elts=[Constant(value="'2013'"), Constant(value="'2014'")], ctx=Load())], keywords=[]))], type_ignores=[])
```
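For comparison, here is a minimal sketch of what Python’s standard ast.dump returns when a single statement from the example script is parsed directly from source. It is only an illustration using the standard library, not how now ast is implemented. The trial output above differs from it in small ways (for instance, constants appear as strings and assignment targets carry Load()), which is consistent with the tree being rebuilt from the stored definition rather than from the script.

```python
import ast

# One statement taken from the example script, parsed straight from source text.
source = "MONTHS = np.arange(12) + 1"

tree = ast.parse(source)   # build the AST with the standard library
dumped = ast.dump(tree)    # single-line string representation of the tree

print(dumped)
```

Here the constant is dumped as the integer 12 and the assignment target carries Store(), since the tree comes directly from the parser.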

There are also two additional output options: JSON (carried over from Week 11 and now integrated into now ast) and Graphviz DOT format. The DOT format uses the Graphviz library to build a structure that represents the trial. The graph itself was developed during Weeks 3-6; the difference is that the command now emits DOT source instead of a rendered SVG or PNG. To retrieve the trial’s AST in DOT format, add -d.

Preview dot output

```bash
now ast 1.1.1 -d
digraph {
    nodesep=0.75
    rankdir=TB
    ranksep=0.75
    node2469938757328 [label="Module experiment.py" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938836304 [label="Import import numpy" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469938836304 [arrowsize=0.5 minlen=1]
    node2469955993552 [label="alias numpy as np" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938836304 -> node2469955993552 [arrowsize=0.5 minlen=1]
    node2469957251472 [label="Import import matplotlib.pyplot" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469957251472 [arrowsize=0.5 minlen=1]
    node2469938506640 [label="alias matplotlib.pyplot as plt" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957251472 -> node2469938506640 [arrowsize=0.5 minlen=1]
    node2469957252048 [label="ImportFrom from precipitation import read, prepare" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469957252048 [arrowsize=0.5 minlen=1]
    node2469957247632 [label="alias read" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957252048 -> node2469957247632 [arrowsize=0.5 minlen=1]
    node2469957252112 [label="alias prepare" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957252048 -> node2469957252112 [arrowsize=0.5 minlen=1]
    node2469956697232 [label="FunctionDef def bar_graph(years): global PREC, MONTHS prepare(PREC, MONTHS, years, plt) plt.savefig(\"out.png\")" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469956697232 [arrowsize=0.5 minlen=1]
    node2469957292432 [label=arguments fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469956697232 -> node2469957292432 [arrowsize=0.5 minlen=1]
    node2469957113680 [label="arg years" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957292432 -> node2469957113680 [arrowsize=0.5 minlen=1]
    node2469956256848 [label="Global global PREC, MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469956697232 -> node2469956256848 [arrowsize=0.5 minlen=1]
    node2469957292176 [label="Expr prepare(PREC, MONTHS, years, plt)" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469956697232 -> node2469957292176 [arrowsize=0.5 minlen=1]
    node2469957293776 [label="Call prepare(PREC, MONTHS, years, plt)" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957292176 -> node2469957293776 [arrowsize=0.5 minlen=1]
    node2469957292304 [label="Name prepare" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957293776 -> node2469957292304 [arrowsize=0.5 minlen=1]
    node2469957291856 [label="Name PREC" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957293776 -> node2469957291856 [arrowsize=0.5 minlen=1]
    node2469957292816 [label="Name MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957293776 -> node2469957292816 [arrowsize=0.5 minlen=1]
    node2469957294352 [label="Name years" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957293776 -> node2469957294352 [arrowsize=0.5 minlen=1]
    node2469957294864 [label="Name plt" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957293776 -> node2469957294864 [arrowsize=0.5 minlen=1]
    node2469957294928 [label="Expr plt.savefig(\"out.png\")" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469956697232 -> node2469957294928 [arrowsize=0.5 minlen=1]
    node2469957295184 [label="Call plt.savefig(\"out.png\")" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957294928 -> node2469957295184 [arrowsize=0.5 minlen=1]
    node2469957297552 [label="Attribute plt.savefig" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957295184 -> node2469957297552 [arrowsize=0.5 minlen=1]
    node2469957297616 [label="Name plt" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957297552 -> node2469957297616 [arrowsize=0.5 minlen=1]
    node2469956689488 [label="Constant \"out.png\"" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957295184 -> node2469956689488 [arrowsize=0.5 minlen=1]
    node2469956924432 [label="Assign MONTHS = np.arange(12) + 1" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469956924432 [arrowsize=0.5 minlen=1]
    node2469957297808 [label="Name MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469956924432 -> node2469957297808 [arrowsize=0.5 minlen=1]
    node2469957298000 [label="BinOp np.arange(12) + 1" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469956924432 -> node2469957298000 [arrowsize=0.5 minlen=1]
    node2469938354832 [label="Call np.arange(12)" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957298000 -> node2469938354832 [arrowsize=0.5 minlen=1]
    node2469957298192 [label="Attribute np.arange" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938354832 -> node2469957298192 [arrowsize=0.5 minlen=1]
    node2469957298512 [label="Name np" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957298192 -> node2469957298512 [arrowsize=0.5 minlen=1]
    node2469957300560 [label="Constant 12" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938354832 -> node2469957300560 [arrowsize=0.5 minlen=1]
    node2469938592912 [label=Add fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957298000 -> node2469938592912 [arrowsize=0.5 minlen=1]
    node2469957300752 [label="Constant 1" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957298000 -> node2469957300752 [arrowsize=0.5 minlen=1]
    node2469957300816 [label="Assign d13, d14 = read('p13.dat'), read('p14.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469957300816 [arrowsize=0.5 minlen=1]
    node2469957301264 [label="Tuple d13, d14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957300816 -> node2469957301264 [arrowsize=0.5 minlen=1]
    node2469957301712 [label="Name d13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957301264 -> node2469957301712 [arrowsize=0.5 minlen=1]
    node2469957303312 [label="Name d14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957301264 -> node2469957303312 [arrowsize=0.5 minlen=1]
    node2469957304144 [label="Tuple read('p13.dat'), read('p14.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957300816 -> node2469957304144 [arrowsize=0.5 minlen=1]
    node2469957304208 [label="Call read('p13.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957304144 -> node2469957304208 [arrowsize=0.5 minlen=1]
    node2469957320784 [label="Name read" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957304208 -> node2469957320784 [arrowsize=0.5 minlen=1]
    node2469957320848 [label="Constant 'p13.dat'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957304208 -> node2469957320848 [arrowsize=0.5 minlen=1]
    node2469957320912 [label="Call read('p14.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957304144 -> node2469957320912 [arrowsize=0.5 minlen=1]
    node2469957321040 [label="Name read" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957320912 -> node2469957321040 [arrowsize=0.5 minlen=1]
    node2469957321104 [label="Constant 'p14.dat'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957320912 -> node2469957321104 [arrowsize=0.5 minlen=1]
    node2469957321168 [label="Assign PREC = prec13, prec14 = [], []" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469957321168 [arrowsize=0.5 minlen=1]
    node2469957321296 [label="Name PREC" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321168 -> node2469957321296 [arrowsize=0.5 minlen=1]
    node2469957321488 [label="Tuple prec13, prec14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321168 -> node2469957321488 [arrowsize=0.5 minlen=1]
    node2469957321616 [label="Name prec13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321488 -> node2469957321616 [arrowsize=0.5 minlen=1]
    node2469957321744 [label="Name prec14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321488 -> node2469957321744 [arrowsize=0.5 minlen=1]
    node2469957321936 [label="Tuple [], []" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321168 -> node2469957321936 [arrowsize=0.5 minlen=1]
    node2469957322128 [label="List []" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321936 -> node2469957322128 [arrowsize=0.5 minlen=1]
    node2469957322320 [label="List []" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957321936 -> node2469957322320 [arrowsize=0.5 minlen=1]
    node2469957322512 [label="For for i in MONTHS: prec13.append(sum(d13[i])) prec14.append(sum(d14[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469957322512 [arrowsize=0.5 minlen=1]
    node2469957322640 [label="Name i" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957322512 -> node2469957322640 [arrowsize=0.5 minlen=1]
    node2469957322768 [label="Name MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957322512 -> node2469957322768 [arrowsize=0.5 minlen=1]
    node2469957322832 [label="Expr prec13.append(sum(d13[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957322512 -> node2469957322832 [arrowsize=0.5 minlen=1]
    node2469957323024 [label="Call prec13.append(sum(d13[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957322832 -> node2469957323024 [arrowsize=0.5 minlen=1]
    node2469957323216 [label="Attribute prec13.append" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323024 -> node2469957323216 [arrowsize=0.5 minlen=1]
    node2469957323344 [label="Name prec13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323216 -> node2469957323344 [arrowsize=0.5 minlen=1]
    node2469957323472 [label="Call sum(d13[i])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323024 -> node2469957323472 [arrowsize=0.5 minlen=1]
    node2469957323600 [label="Name sum" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323472 -> node2469957323600 [arrowsize=0.5 minlen=1]
    node2469957323792 [label=Subscript fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323472 -> node2469957323792 [arrowsize=0.5 minlen=1]
    node2469957323856 [label="Name d13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323792 -> node2469957323856 [arrowsize=0.5 minlen=1]
    node2469957323984 [label="Name i" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957323792 -> node2469957323984 [arrowsize=0.5 minlen=1]
    node2469957324048 [label="Expr prec14.append(sum(d14[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957322512 -> node2469957324048 [arrowsize=0.5 minlen=1]
    node2469957324176 [label="Call prec14.append(sum(d14[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324048 -> node2469957324176 [arrowsize=0.5 minlen=1]
    node2469957324368 [label="Attribute prec14.append" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324176 -> node2469957324368 [arrowsize=0.5 minlen=1]
    node2469957324496 [label="Name prec14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324368 -> node2469957324496 [arrowsize=0.5 minlen=1]
    node2469957324624 [label="Call sum(d14[i])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324176 -> node2469957324624 [arrowsize=0.5 minlen=1]
    node2469957324752 [label="Name sum" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324624 -> node2469957324752 [arrowsize=0.5 minlen=1]
    node2469957324944 [label=Subscript fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324624 -> node2469957324944 [arrowsize=0.5 minlen=1]
    node2469957325072 [label="Name d14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324944 -> node2469957325072 [arrowsize=0.5 minlen=1]
    node2469957325200 [label="Name i" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957324944 -> node2469957325200 [arrowsize=0.5 minlen=1]
    node2469957325264 [label="Expr bar_graph(['2013', '2014'])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469938757328 -> node2469957325264 [arrowsize=0.5 minlen=1]
    node2469957325392 [label="Call bar_graph(['2013', '2014'])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957325264 -> node2469957325392 [arrowsize=0.5 minlen=1]
    node2469957325520 [label="Name bar_graph" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957325392 -> node2469957325520 [arrowsize=0.5 minlen=1]
    node2469957325776 [label="List ['2013', '2014']" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957325392 -> node2469957325776 [arrowsize=0.5 minlen=1]
    node2469957325904 [label="Constant '2013'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957325776 -> node2469957325904 [arrowsize=0.5 minlen=1]
    node2469957326032 [label="Constant '2014'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
    node2469957325776 -> node2469957326032 [arrowsize=0.5 minlen=1]
}
```
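To make the shape of that DOT output easier to follow, here is a rough sketch of how an AST can be walked and written out as one node statement plus one edge statement per child. It is not the code used in noworkflow, which builds the graph with the Graphviz library from the trial’s definition; the helper ast_to_dot is a hypothetical name, and it writes plain DOT text using only the standard library.

```python
import ast


def ast_to_dot(source):
    """Return DOT source with one graph node per AST node (illustrative helper)."""
    lines = ["digraph {", "\trankdir=TB"]
    counter = 0

    def visit(node, parent=None):
        nonlocal counter
        name = f"node{counter}"          # sequential names instead of object ids
        counter += 1
        lines.append(f'\t{name} [label="{type(node).__name__}"]')
        if parent is not None:
            lines.append(f"\t{parent} -> {name}")
        for child in ast.iter_child_nodes(node):
            # Load/Store context markers are left out to keep the graph small.
            if not isinstance(child, ast.expr_context):
                visit(child, name)

    visit(ast.parse(source))
    lines.append("}")
    return "\n".join(lines)


dot = ast_to_dot("x = 1")
print(dot)
```

The resulting text can be passed to any Graphviz tool for rendering, which is the main advantage of emitting DOT rather than a finished image.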

Additionally, the now diff command now includes an option to display discrepancies between trials, for either activations or definitions. The differences are measured with the APTED algorithm, a state-of-the-art solution for computing tree edit distance, that is, the minimum cost required to transform one tree into another. APTED supports customizable cost models and input parsing, and provides both the tree edit distance value and a node mapping that reflects this distance. To show the tree edit distance between two trials, add -d for definitions or -a for activations.

```bash
now diff 1.1.1 4.1.1 -d
[now] trial diff:
  Start changed from 2024-08-20 23:49:13.536399 to 2024-08-20 23:49:36.048229
  Finish changed from 2024-08-20 23:49:16.173524 to 2024-08-20 23:49:38.974875
  Duration text changed from 0:00:02.637125 to 0:00:02.926646
  Code hash changed from 0ff174577af1057d2e0fcca20cff0aebf0635db9 to 3cfe8af3cc9746bbe88461a4b294cc5c372a22c6
  Sequence key changed from 1 to 5
  Parent id changed from <None> to 05479529-f8d2-4d8c-8ba3-0a8ea851abe3

Definition TED: 25.0
```
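To give a feel for what a tree edit distance value means, here is a small sketch that computes it for two tiny snippets. Note that this is not APTED and not the code behind now diff: it is a naive memoized recursion over ordered forests, which gives the same kind of distance on small trees but is far slower. The unit cost for every insert, delete, and rename is an assumption, as are all the function names.

```python
import ast
from functools import lru_cache


def to_tree(node):
    """Convert an ast node into a hashable (label, children) tuple."""
    if isinstance(node, ast.Name):
        label = f"Name {node.id}"
    elif isinstance(node, ast.Constant):
        label = f"Constant {node.value!r}"
    else:
        label = type(node).__name__
    children = tuple(
        to_tree(child)
        for child in ast.iter_child_nodes(node)
        if not isinstance(child, ast.expr_context)  # skip Load/Store markers
    )
    return (label, children)


def size(tree):
    """Number of nodes in a (label, children) tree."""
    return 1 + sum(size(child) for child in tree[1])


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f1:
        return sum(size(t) for t in f2)  # insert everything left in f2
    if not f2:
        return sum(size(t) for t in f1)  # delete everything left in f1
    (label1, kids1), (label2, kids2) = f1[-1], f2[-1]
    return min(
        forest_distance(f1[:-1] + kids1, f2) + 1,  # delete rightmost root of f1
        forest_distance(f1, f2[:-1] + kids2) + 1,  # insert rightmost root of f2
        forest_distance(kids1, kids2)              # match the two roots,
        + forest_distance(f1[:-1], f2[:-1])        # align the remaining trees,
        + (label1 != label2),                      # and pay 1 if labels differ
    )


def tree_edit_distance(source_a, source_b):
    """Distance between the ASTs of two source snippets."""
    tree_a = to_tree(ast.parse(source_a))
    tree_b = to_tree(ast.parse(source_b))
    return forest_distance((tree_a,), (tree_b,))


same = tree_edit_distance("x = 1", "x = 1")
renamed = tree_edit_distance("x = 1", "x = 2")
grown = tree_edit_distance("x = 1", "x = 1 + 2")
print(same, renamed, grown)  # prints: 0 1 3
```

Changing the constant costs one rename, while turning 1 into 1 + 2 costs three inserts (the BinOp, the Add operator, and the new constant). A value such as 25.0 between two trials can be read the same way, as the total cost of the edits separating their definitions under whatever cost model the command uses.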

Wrap Up

With this, I’ve wrapped up all my work and successfully completed my Google Summer of Code 2024 project. I’ve implemented several ways to display trial discrepancies.

I also wrote my final GSoC 2024 report: GitHub Gist

Relevant PRs: #165


I sincerely thank my mentor, João Felipe, for his support throughout this project. Working with him was a valuable experience. Thanks also to Ivan Ogasawara for his help, and to Google for the opportunity to contribute through Google Summer of Code.

Thank you for following along with my journey. I look forward to sharing more updates with you soon.

Ending happily with GSoC 2024, Joshua Talahatu