Finalizing GSoC: AST Output and Project Wrap-Up

Introduced AST output features and wrapped up the GSoC 2024 project with final enhancements and documentation.


Additional Feature

This week, I introduced a significant new feature to align with the project goals. The now ast command provides access to a trial's Abstract Syntax Tree (AST). By default, the output is a string generated by ast.dump, built from the trial's definition rather than from the script itself, as shown below:

now ast 1.1.1 
Module(body=[Import(names=[alias(name='numpy', asname='np')]), Import(names=[alias(name='matplotlib.pyplot', asname='plt')]), ImportFrom(module='precipitation', names=[alias(name='read'), alias(name='prepare')], level=0), FunctionDef(name='bar_graph', args=arguments(posonlyargs=[], args=[arg(arg='years')], kwonlyargs=[], kw_defaults=[], defaults=[]), body=[Global(names=['PREC', ' MONTHS']), Expr(value=Call(func=Name(id='prepare', ctx=Load()), args=[Name(id='PREC', ctx=Load()), Name(id='MONTHS', ctx=Load()), Name(id='years', ctx=Load()), Name(id='plt', ctx=Load())], keywords=[])), Expr(value=Call(func=Attribute(value=Name(id='plt', ctx=Load()), attr='savefig', ctx=Load()), args=[Constant(value='"out.png"')], keywords=[]))], decorator_list=[], type_params=[]), Assign(targets=[Name(id='MONTHS', ctx=Load())], value=BinOp(left=Call(func=Attribute(value=Name(id='np', ctx=Load()), attr='arange', ctx=Load()), args=[Constant(value='12')], 
keywords=[]), op=Add(), right=Constant(value='1'))), Assign(targets=[Tuple(elts=[Name(id='d13', ctx=Load()), Name(id='d14', ctx=Load())], ctx=Load())], value=Tuple(elts=[Call(func=Name(id='read', ctx=Load()), args=[Constant(value="'p13.dat'")], keywords=[]), Call(func=Name(id='read', ctx=Load()), args=[Constant(value="'p14.dat'")], keywords=[])], ctx=Load())), Assign(targets=[Name(id='PREC', ctx=Load()), Tuple(elts=[Name(id='prec13', ctx=Load()), Name(id='prec14', ctx=Load())], ctx=Load())], value=Tuple(elts=[List(elts=[], ctx=Load()), List(elts=[], ctx=Load())], ctx=Load())), For(target=Name(id='i', ctx=Load()), iter=Name(id='MONTHS', ctx=Load()), body=[Expr(value=Call(func=Attribute(value=Name(id='prec13', ctx=Load()), attr='append', ctx=Load()), args=[Call(func=Name(id='sum', ctx=Load()), args=[Subscript(value=Name(id='d13', ctx=Load()), slice=Name(id='i', ctx=Load()), ctx=Load())], keywords=[])], keywords=[])), Expr(value=Call(func=Attribute(value=Name(id='prec14', ctx=Load()), attr='append', ctx=Load()), args=[Call(func=Name(id='sum', ctx=Load()), args=[Subscript(value=Name(id='d14', ctx=Load()), slice=Name(id='i', ctx=Load()), ctx=Load())], keywords=[])], 
keywords=[]))], orelse=[]), Expr(value=Call(func=Name(id='bar_graph', ctx=Load()), args=[List(elts=[Constant(value="'2013'"), Constant(value="'2014'")], ctx=Load())], keywords=[]))], type_ignores=[])
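
For readers unfamiliar with ast.dump, here is a minimal sketch of how Python's standard ast module produces this kind of string. It parses source text directly, whereas now ast rebuilds the tree from the trial's stored definition, so treat it only as an illustration of the output format. The two-line source snippet is a made-up example.

```python
import ast

# Hypothetical two-line script, used only to illustrate the output format
source = "import numpy as np\nMONTHS = np.arange(12) + 1\n"

tree = ast.parse(source)

# Single-line form, comparable to the default `now ast` output
dumped = ast.dump(tree)
print(dumped)

# Indented form, easier to read for larger trees (Python 3.9+)
print(ast.dump(tree, indent=2))
```

The indented form is usually more practical when inspecting a whole script by eye.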

There are also two additional output options: JSON (integrated into now ast from the Week 11 work) and Graphviz DOT. The DOT output uses the Graphviz library to build a graph that represents the trial. This graph was developed during Weeks 3-6; the command now emits it as DOT text instead of rendering it to SVG or PNG. To retrieve the trial's AST in DOT format, add the -d flag.

Preview of the DOT output:
now ast 1.1.1 -d
digraph {
        nodesep=0.75 rankdir=TB ranksep=0.75
        node2469938757328 [label="Module
experiment.py" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938836304 [label="Import
import numpy" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469938836304 [arrowsize=0.5 minlen=1]
        node2469955993552 [label="alias
numpy as np" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938836304 -> node2469955993552 [arrowsize=0.5 minlen=1]
        node2469957251472 [label="Import
import matplotlib.pyplot" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469957251472 [arrowsize=0.5 minlen=1]
        node2469938506640 [label="alias
matplotlib.pyplot as plt" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957251472 -> node2469938506640 [arrowsize=0.5 minlen=1]
        node2469957252048 [label="ImportFrom
from precipitation import read, prepare" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469957252048 [arrowsize=0.5 minlen=1]
        node2469957247632 [label="alias
read" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957252048 -> node2469957247632 [arrowsize=0.5 minlen=1]
        node2469957252112 [label="alias
prepare" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957252048 -> node2469957252112 [arrowsize=0.5 minlen=1]
        node2469956697232 [label="FunctionDef
def bar_graph(years):
global PREC, MONTHS
prepare(PREC, MONTHS, years, plt)
plt.savefig(\"out.png\")" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469956697232 [arrowsize=0.5 minlen=1]
        node2469957292432 [label=arguments fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469956697232 -> node2469957292432 [arrowsize=0.5 minlen=1]
        node2469957113680 [label="arg
years" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957292432 -> node2469957113680 [arrowsize=0.5 minlen=1]
        node2469956256848 [label="Global
global PREC, MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469956697232 -> node2469956256848 [arrowsize=0.5 minlen=1]
        node2469957292176 [label="Expr
prepare(PREC, MONTHS, years, plt)" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469956697232 -> node2469957292176 [arrowsize=0.5 minlen=1]
        node2469957293776 [label="Call
prepare(PREC, MONTHS, years, plt)" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957292176 -> node2469957293776 [arrowsize=0.5 minlen=1]
        node2469957292304 [label="Name
prepare" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957293776 -> node2469957292304 [arrowsize=0.5 minlen=1]
        node2469957291856 [label="Name
PREC" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957293776 -> node2469957291856 [arrowsize=0.5 minlen=1]
        node2469957292816 [label="Name
MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957293776 -> node2469957292816 [arrowsize=0.5 minlen=1]
        node2469957294352 [label="Name
years" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957293776 -> node2469957294352 [arrowsize=0.5 minlen=1]
        node2469957294864 [label="Name
plt" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957293776 -> node2469957294864 [arrowsize=0.5 minlen=1]
        node2469957294928 [label="Expr
plt.savefig(\"out.png\")" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469956697232 -> node2469957294928 [arrowsize=0.5 minlen=1]
        node2469957295184 [label="Call
plt.savefig(\"out.png\")" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957294928 -> node2469957295184 [arrowsize=0.5 minlen=1]
        node2469957297552 [label="Attribute
plt.savefig" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957295184 -> node2469957297552 [arrowsize=0.5 minlen=1]
        node2469957297616 [label="Name
plt" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957297552 -> node2469957297616 [arrowsize=0.5 minlen=1]
        node2469956689488 [label="Constant
\"out.png\"" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957295184 -> node2469956689488 [arrowsize=0.5 minlen=1]
        node2469956924432 [label="Assign
MONTHS = np.arange(12) + 1" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469956924432 [arrowsize=0.5 minlen=1]
        node2469957297808 [label="Name
MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469956924432 -> node2469957297808 [arrowsize=0.5 minlen=1]
        node2469957298000 [label="BinOp
np.arange(12) + 1" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469956924432 -> node2469957298000 [arrowsize=0.5 minlen=1]
        node2469938354832 [label="Call
np.arange(12)" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957298000 -> node2469938354832 [arrowsize=0.5 minlen=1]
        node2469957298192 [label="Attribute
np.arange" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938354832 -> node2469957298192 [arrowsize=0.5 minlen=1]
        node2469957298512 [label="Name
np" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957298192 -> node2469957298512 [arrowsize=0.5 minlen=1]
        node2469957300560 [label="Constant
12" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938354832 -> node2469957300560 [arrowsize=0.5 minlen=1]
        node2469938592912 [label=Add fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957298000 -> node2469938592912 [arrowsize=0.5 minlen=1]
        node2469957300752 [label="Constant
1" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957298000 -> node2469957300752 [arrowsize=0.5 minlen=1]
        node2469957300816 [label="Assign
d13, d14 = read('p13.dat'), read('p14.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469957300816 [arrowsize=0.5 minlen=1]
        node2469957301264 [label="Tuple
d13, d14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957300816 -> node2469957301264 [arrowsize=0.5 minlen=1]
        node2469957301712 [label="Name
d13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957301264 -> node2469957301712 [arrowsize=0.5 minlen=1]
        node2469957303312 [label="Name
d14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957301264 -> node2469957303312 [arrowsize=0.5 minlen=1]
        node2469957304144 [label="Tuple
read('p13.dat'), read('p14.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957300816 -> node2469957304144 [arrowsize=0.5 minlen=1]
        node2469957304208 [label="Call
read('p13.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957304144 -> node2469957304208 [arrowsize=0.5 minlen=1]
        node2469957320784 [label="Name
read" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957304208 -> node2469957320784 [arrowsize=0.5 minlen=1]
        node2469957320848 [label="Constant
'p13.dat'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957304208 -> node2469957320848 [arrowsize=0.5 minlen=1]
        node2469957320912 [label="Call
read('p14.dat')" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957304144 -> node2469957320912 [arrowsize=0.5 minlen=1]
        node2469957321040 [label="Name
read" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957320912 -> node2469957321040 [arrowsize=0.5 minlen=1]
        node2469957321104 [label="Constant
'p14.dat'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957320912 -> node2469957321104 [arrowsize=0.5 minlen=1]
        node2469957321168 [label="Assign
PREC = prec13, prec14 = [], []" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469957321168 [arrowsize=0.5 minlen=1]
        node2469957321296 [label="Name
PREC" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321168 -> node2469957321296 [arrowsize=0.5 minlen=1]
        node2469957321488 [label="Tuple
prec13, prec14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321168 -> node2469957321488 [arrowsize=0.5 minlen=1]
        node2469957321616 [label="Name
prec13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321488 -> node2469957321616 [arrowsize=0.5 minlen=1]
        node2469957321744 [label="Name
prec14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321488 -> node2469957321744 [arrowsize=0.5 minlen=1]
        node2469957321936 [label="Tuple
[], []" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321168 -> node2469957321936 [arrowsize=0.5 minlen=1]
        node2469957322128 [label="List
[]" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321936 -> node2469957322128 [arrowsize=0.5 minlen=1]
        node2469957322320 [label="List
[]" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957321936 -> node2469957322320 [arrowsize=0.5 minlen=1]
        node2469957322512 [label="For
for i in MONTHS:
    prec13.append(sum(d13[i]))
    prec14.append(sum(d14[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469957322512 [arrowsize=0.5 minlen=1]
        node2469957322640 [label="Name
i" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957322512 -> node2469957322640 [arrowsize=0.5 minlen=1]
        node2469957322768 [label="Name
MONTHS" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957322512 -> node2469957322768 [arrowsize=0.5 minlen=1]
        node2469957322832 [label="Expr
prec13.append(sum(d13[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957322512 -> node2469957322832 [arrowsize=0.5 minlen=1]
        node2469957323024 [label="Call
prec13.append(sum(d13[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957322832 -> node2469957323024 [arrowsize=0.5 minlen=1]
        node2469957323216 [label="Attribute
prec13.append" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323024 -> node2469957323216 [arrowsize=0.5 minlen=1]
        node2469957323344 [label="Name
prec13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323216 -> node2469957323344 [arrowsize=0.5 minlen=1]
        node2469957323472 [label="Call
sum(d13[i])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323024 -> node2469957323472 [arrowsize=0.5 minlen=1]
        node2469957323600 [label="Name
sum" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323472 -> node2469957323600 [arrowsize=0.5 minlen=1]
        node2469957323792 [label=Subscript fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323472 -> node2469957323792 [arrowsize=0.5 minlen=1]
        node2469957323856 [label="Name
d13" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323792 -> node2469957323856 [arrowsize=0.5 minlen=1]
        node2469957323984 [label="Name
i" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957323792 -> node2469957323984 [arrowsize=0.5 minlen=1]
        node2469957324048 [label="Expr
prec14.append(sum(d14[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957322512 -> node2469957324048 [arrowsize=0.5 minlen=1]
        node2469957324176 [label="Call
prec14.append(sum(d14[i]))" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324048 -> node2469957324176 [arrowsize=0.5 minlen=1]
        node2469957324368 [label="Attribute
prec14.append" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324176 -> node2469957324368 [arrowsize=0.5 minlen=1]
        node2469957324496 [label="Name
prec14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324368 -> node2469957324496 [arrowsize=0.5 minlen=1]
        node2469957324624 [label="Call
sum(d14[i])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324176 -> node2469957324624 [arrowsize=0.5 minlen=1]
        node2469957324752 [label="Name
sum" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324624 -> node2469957324752 [arrowsize=0.5 minlen=1]
        node2469957324944 [label=Subscript fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324624 -> node2469957324944 [arrowsize=0.5 minlen=1]
        node2469957325072 [label="Name
d14" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324944 -> node2469957325072 [arrowsize=0.5 minlen=1]
        node2469957325200 [label="Name
i" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957324944 -> node2469957325200 [arrowsize=0.5 minlen=1]
        node2469957325264 [label="Expr
bar_graph(['2013', '2014'])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469938757328 -> node2469957325264 [arrowsize=0.5 minlen=1]
        node2469957325392 [label="Call
bar_graph(['2013', '2014'])" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957325264 -> node2469957325392 [arrowsize=0.5 minlen=1]
        node2469957325520 [label="Name
bar_graph" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957325392 -> node2469957325520 [arrowsize=0.5 minlen=1]
        node2469957325776 [label="List
['2013', '2014']" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957325392 -> node2469957325776 [arrowsize=0.5 minlen=1]
        node2469957325904 [label="Constant
'2013'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957325776 -> node2469957325904 [arrowsize=0.5 minlen=1]
        node2469957326032 [label="Constant
'2014'" fontsize=10 height=0.2 margin="0.1,0.1" width=0.2]
        node2469957325776 -> node2469957326032 [arrowsize=0.5 minlen=1]
}
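
To give a feel for how an AST maps onto a DOT graph like the one above, here is a minimal sketch that uses only the standard library. It is not noworkflow's implementation (which relies on the Graphviz library and the trial's definition); the function names ast_to_dot and escape, the counter-based node ids, and the reduced styling are all assumptions made for illustration.

```python
import ast


def escape(text):
    """Escape a string for use inside a double-quoted DOT label."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def ast_to_dot(source):
    """Build DOT text: one node per AST node, one edge per parent/child pair."""
    lines = ["digraph {", "    rankdir=TB"]
    counter = 0

    def visit(node, parent_id=None):
        nonlocal counter
        node_id = f"node{counter}"
        counter += 1
        # Source text of this subtree; empty for nodes such as Add (Python 3.9+)
        code = ast.unparse(node)
        label = type(node).__name__ + ("\n" + code if code else "")
        lines.append(f'    {node_id} [label="{escape(label)}"]')
        if parent_id is not None:
            lines.append(f"    {parent_id} -> {node_id}")
        for child in ast.iter_child_nodes(node):
            # Skip Load/Store/Del context markers to keep the graph readable
            if not isinstance(child, ast.expr_context):
                visit(child, node_id)

    visit(ast.parse(source))
    lines.append("}")
    return "\n".join(lines)


dot = ast_to_dot("MONTHS = np.arange(12) + 1")
print(dot)
```

The printed text can be saved to a file and rendered with the Graphviz dot command-line tool, for example dot -Tsvg graph.dot -o graph.svg.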

Additionally, the now diff command now includes an option to display discrepancies between trials, for either activations or definitions. The differences are computed with the APTED algorithm, a state-of-the-art solution for computing tree edit distance (TED): the minimum cost required to transform one tree into another. APTED supports customizable cost models and input parsing, and it returns both the tree edit distance value and a node mapping that reflects this distance. To include the tree edit distance in the diff output, add -d for definitions and -a for activations.

now diff 1.1.1 4.1.1 -d
[now] trial diff:
  Start changed from 2024-08-20 23:49:13.536399 to 2024-08-20 23:49:36.048229
  Finish changed from 2024-08-20 23:49:16.173524 to 2024-08-20 23:49:38.974875
  Duration text changed from 0:00:02.637125 to 0:00:02.926646
  Code hash changed from 0ff174577af1057d2e0fcca20cff0aebf0635db9 to 3cfe8af3cc9746bbe88461a4b294cc5c372a22c6
  Sequence key changed from 1 to 5
  Parent id changed from <None> to 05479529-f8d2-4d8c-8ba3-0a8ea851abe3
 
Definition TED: 25.0
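
To make the idea of tree edit distance concrete, here is a minimal sketch of the textbook recursive definition with unit costs (insert, delete, and relabel each cost 1), applied to Python ASTs. It is not APTED, which computes the same kind of distance with a far more efficient strategy, and the node labels and cost model here are assumptions, so its numbers are not expected to match the TED values reported by now diff.

```python
import ast
from functools import lru_cache


def label(node):
    """Node label: the type name, plus the identifier or value where relevant."""
    if isinstance(node, ast.Name):
        return f"Name:{node.id}"
    if isinstance(node, ast.Constant):
        return f"Constant:{node.value!r}"
    return type(node).__name__


def to_tree(node):
    """Convert an AST node into a hashable (label, children) tuple."""
    children = tuple(
        to_tree(child)
        for child in ast.iter_child_nodes(node)
        if not isinstance(child, ast.expr_context)  # ignore Load/Store markers
    )
    return (label(node), children)


@lru_cache(maxsize=None)
def forest_distance(f1, f2):
    """Unit-cost edit distance between two ordered forests (tuples of trees)."""
    if not f1 and not f2:
        return 0
    if not f1:
        # Insert the rightmost root of f2; its children take its place
        return forest_distance(f1, f2[:-1] + f2[-1][1]) + 1
    if not f2:
        # Delete the rightmost root of f1; its children take its place
        return forest_distance(f1[:-1] + f1[-1][1], f2) + 1
    (l1, c1), (l2, c2) = f1[-1], f2[-1]
    return min(
        forest_distance(f1[:-1] + c1, f2) + 1,  # delete rightmost root of f1
        forest_distance(f1, f2[:-1] + c2) + 1,  # insert rightmost root of f2
        # Map the two rightmost roots onto each other (relabel if they differ)
        forest_distance(c1, c2) + forest_distance(f1[:-1], f2[:-1]) + (l1 != l2),
    )


def tree_edit_distance(src_a, src_b):
    """Edit distance between the ASTs of two source strings."""
    return forest_distance((to_tree(ast.parse(src_a)),), (to_tree(ast.parse(src_b)),))


print(tree_edit_distance("x = 1", "x = 2"))         # 1: one Constant relabeled
print(tree_edit_distance("x = 1", "x = 1\ny = 2"))  # 3: Assign, Name, Constant inserted
```

The memoized recursion is only practical for small trees, which is one reason a dedicated algorithm such as APTED is used for real trials.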

Wrap Up

With this, I’ve wrapped up all my work and successfully completed my Google Summer of Code 2024 project. I’ve implemented several ways to display trial discrepancies.

I also wrote my final GSoC 2024 report: GitHub Gist

Relevant PRs: #165


I sincerely thank my mentor, João Felipe, for his support throughout this project. Working with him was a valuable experience. Thanks also to Ivan Ogasawara for his help, and to Google for the opportunity to contribute through Google Summer of Code.

Thank you for following along with my journey. I look forward to sharing more updates with you soon.

Ending happily with GSoC 2024, Joshua Talahatu