In this tutorial we will introduce GitHub Actions to scientists as a tool for lightweight automation of scientific data workflows. We will
demonstrate that GitHub Actions are not just a tool for software testing, but can be used in various ways to improve the reproducibility
-and impact of scientific analysis. Through a sequence of examples, we will demonstrate some of Github Actions' applications to scientific
+and impact of scientific analysis. Through a sequence of examples, we will demonstrate some of GitHub Actions' applications to scientific
workflows, such as scheduled deployment of algorithms to sensor streams, updating visualizations based on new data, processing large
datasets, model versioning and performance benchmarking. GitHub Actions can particularly empower Python scientific programmers who are not
willing to build fully-fledged applications or set up complex computational infrastructure, but would like to increase the impact of their
work. The goal is that participants will leave with their own ideas of how to integrate GitHub Actions in their own work.

-## Description:
+## Description

GitHub Actions are quite popular within the software engineering community, but a scientific Python programmer may not have seen their use
beyond a continuous integration framework for unit testing. We would like to increase their visibility through a scientific workflow lens.
We will use examples that are relevant to the community: wrangling a messy realtime hydrophone data stream to display noise sounds from the
Puget Sound (not far from the conference venue!) or processing hundreds of satellite radar images over glacial lakes in High-Mountain Asia
-to study flood hazards. We assume no knowledge on Github Actions and will start slowly with a “Hello World” step, but build quickly to
+to study flood hazards. We assume no knowledge on GitHub Actions and will start slowly with a “Hello World” step, but build quickly to
create complex and exciting workflows. We will also showcase their value for scientific collaborations across institutions as a means to
share reproducible workflows and computing infrastructure.
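
As a rough illustration of the kind of “Hello World” step mentioned above, a minimal workflow might look like the following sketch. The file path, workflow name, and job name are illustrative assumptions and are not taken from the tutorial materials.

```yaml
# Hypothetical location: .github/workflows/hello.yml
name: hello-world            # assumed name, shown in the Actions tab

on:
  workflow_dispatch:         # allows a manual run from the GitHub interface
  push:                      # also runs on every push to the repository

jobs:
  hello:                     # assumed job name
    runs-on: ubuntu-latest   # GitHub-hosted Linux runner
    steps:
      - name: Say hello
        run: echo "Hello World"
```

Running it manually from the Actions tab should show a single job whose log contains the echoed message.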

-## Prerequisites:
-GitHub account, familiarity with git, GitHub, and Python (conda, scipy, matplotlib), some maturity in manipulating scientific data and
+## Prerequisites
+GitHub account, familiarity with git (commits, versioning), GitHub (push, pull requests), and Python (conda, scipy, matplotlib), some maturity in manipulating scientific data and
exposure to the challenges associated with it, ability to read code (our examples may use libraries not familiar to the audience, but the
focus will be on the steps these libraries accomplish rather than the details)

-## Installation Instructions:
+## Installation Instructions
Participants can make edits from the GitHub interface, but if they are willing to make updates locally, they need to have a functioning git
([set up instructions](https://swcarpentry.github.io/git-novice/#installing-git))
## Outline

+## Short Version
```{tableofcontents}
```

+## Long Version (with approximate schedule)
* Overview of GitHub Actions and Workflows and their popular uses in Python software development (examples of testing, linting,
-packaging)(30 min)
+packaging)(20 min)
* We will explain the main components of GitHub Actions and associated terminology
* We will summarize their typical uses in software development
* We will point to popular GitHub Actions used in Python software development and packaging (the focus of this tutorial will not be
@@ -54,36 +56,36 @@ on them but rather on scientific pipelines)
* we will deploy a typical scientific workflow: reading data, converting to a new format, and making a visualization
* participants will update the deployment schedule to trigger a new workflow and will monitor the progress in the GitHub interface

-* Break (10 min)
+* Break (15 min)
* Exporting results (30 min)
* participants will learn about various ways to store the results:
* caching
-* creating GitHub artifacts
* committing to GitHub
-* storing to own storage
-* they will modify the code to make their own plot which will be automatically updated
+* creating GitHub artifacts
+* storing to personal storage
+* they will modify the code to make a new plot which will be automatically updated
* they will use either matplotlib or an interactive library such as plotly
* Update results on a webpage (30 min)
* we will overview different ways to display scientific results on a webpage
* we will demonstrate the workflow to deploy the webpage
* participants will rerender the webpage based on the updates in GitHub

-* Large-scale data processing (30 min)
-* we will demonstrate a use-case of processing large data sets with Github Actions
+* Large-scale data processing (45 min)
+* we will demonstrate a use-case of processing large data sets with GitHub Actions
* participants will fiddle with problem size to understand the power and limits of the computational infrastructure
* we will discuss connections to cluster/cloud computing
* Break (10 min)

-* Model Versioning and Comparison (30 min)
+* Model Versioning and Benchmarking (20 min)
* we will introduce how to leverage GitHub’s version control to version different models and performance
* participants can contribute a new model and check its performance
* we will discuss how this can be used as a community network to share methods and results
* Recap and Discussion (or buffer time) (20 min)
-* we will have a discussion on potential uses of Github Actions within the work of the participants
+* we will have a discussion on potential uses of GitHub Actions within the work of the participants
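
The outline items on scheduled deployment and on exporting results could be combined in a workflow along the following lines. This is only a sketch under stated assumptions: the script path `scripts/make_plot.py`, the output path `output/plot.png`, the installed packages, and the cron schedule are hypothetical placeholders, not the tutorial's actual files or settings.

```yaml
# Hypothetical location: .github/workflows/update_plot.yml
name: update-plot

on:
  schedule:
    - cron: "0 6 * * *"      # assumed schedule: once a day at 06:00 UTC
  workflow_dispatch:         # also allow manual runs while experimenting

jobs:
  process:
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.11"

      - name: Install dependencies
        run: pip install numpy matplotlib   # assumed dependencies

      - name: Read data, convert it, and make a plot
        run: python scripts/make_plot.py    # hypothetical script

      - name: Store the plot as a workflow artifact
        uses: actions/upload-artifact@v4
        with:
          name: plot
          path: output/plot.png             # hypothetical output file
```

Editing the `cron` expression and committing the change is one way participants could alter the deployment schedule and then watch the next run in the Actions tab. Artifacts are only one of the storage options listed in the outline; caching, committing back to the repository, or pushing to external storage would replace the last step.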