Commit c43c6a5

Author: Quarto GHA Workflow Runner
Built site for gh-pages
1 parent ea20e77 commit c43c6a5

26 files changed: +116 -104 lines changed

.nojekyll

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-b5fa3cfc
+6f8279bd

_tex/index.tex

Lines changed: 12 additions & 32 deletions
@@ -5,9 +5,7 @@
 %
 \documentclass[
 number,
-preprint,
-3p,
-twocolumn]{elsarticle}
+preprint]{elsarticle}

 \usepackage{amsmath,amssymb}
 \usepackage{iftex}
@@ -134,26 +132,6 @@
 \@ifpackageloaded{caption}{}{\usepackage{caption}}
 \@ifpackageloaded{subcaption}{}{\usepackage{subcaption}}
 \makeatother
-\usepackage{float}
-\makeatletter
-\let\oldlt\longtable
-\let\endoldlt\endlongtable
-\def\longtable{\@ifnextchar[\longtable@i \longtable@ii}
-\def\longtable@i[#1]{\begin{figure}[H]
-\onecolumn
-\begin{minipage}{0.5\textwidth}
-\oldlt[#1]
-}
-\def\longtable@ii{\begin{figure}[H]
-\onecolumn
-\begin{minipage}{0.5\textwidth}
-\oldlt
-}
-\def\endlongtable{\endoldlt
-\end{minipage}
-\twocolumn
-\end{figure}}
-\makeatother

 \ifLuaTeX
 \usepackage{selnolig} % disable illegal ligatures
@@ -396,14 +374,14 @@ \subsection{Earth sciences}\label{earth-sciences}
 also emerged, such as the Cloud Optimized GeoTIFF (COG), followed by
 Zarr and Apache Parquet for array and tabular data, respectively. In
 2006, the Open Source Geospatial Foundation (OSGeo,
-https://www.osgeo.org) was established, demonstrating the community's
-commitment to the development of open-source geospatial technologies.
-While some standards have been developed in the industry (e.g., Keyhole
-Markup Language (KML) by Keyhole Inc., which Google later acquired),
-they later became international standards of the OGC, which now
-encompasses more than 450 commercial, governmental, nonprofit, and
+\url{https://www.osgeo.org}) was established, demonstrating the
+community's commitment to the development of open-source geospatial
+technologies. While some standards have been developed in the industry
+(e.g., Keyhole Markup Language (KML) by Keyhole Inc., which Google later
+acquired), they later became international standards of the OGC, which
+now encompasses more than 450 commercial, governmental, nonprofit, and
 research organizations working together on the development and
-implementation of open standards (https://www.ogc.org).
+implementation of open standards \url{https://www.ogc.org}.

 \subsection{Neuroscience}\label{neuroscience}

@@ -441,7 +419,7 @@ \subsection{Community science}\label{community-science}

 Another interesting use case for open-source standards is
 community/citizen science. An early example of this approach is
-OpenStreetMap (https://www.openstreetmap.org), which allows users to
+OpenStreetMap \url{https://www.openstreetmap.org}, which allows users to
 contribute to the project development with code and data and freely use
 the maps and other related geospatial datasets. But this example is not
 unique. Overall, this approach has grown in the last 20 years and has
@@ -979,7 +957,7 @@ \subsubsection{Manage Cross Sector
 organizations). Similar to program officers at funding agencies,
 standards evolution need sustained PM efforts. Multi-party partnerships
 should include strategic initiatives for standard establishment such as
-the Pistoia Alliance (https://www.pistoiaalliance.org/).
+the Pistoia Alliance (\url{https://www.pistoiaalliance.org/}).

 \section{Acknowledgements}\label{acknowledgements}

@@ -998,6 +976,8 @@ \section{Acknowledgements}\label{acknowledgements}
 in this report do not necessarily reflect those of the National Science
 Foundation.

+\newpage
+
 \section{Appendix: List of
 participants}\label{appendix-list-of-participants}

_tex/references.bib

Lines changed: 1 addition & 1 deletion
@@ -2,7 +2,7 @@ @MISC{Van-Tuyl2023-vp
 title = "Hiring, managing, and retaining data scientists and Research
 Software Engineers in academia: A career guidebook from {ADSA}
 and {US}-{RSE}",
-author = "Van Tuyl, Steve (ed )",
+editor = "Van Tuyl, Steve",
 doi = {https://doi.org/10.5281/zenodo.8329337},
 url = {https://zenodo.org/records/8329337},
 abstract = "The importance of data, software, and computation has long been

index.docx

128 Bytes
Binary file not shown.

index.html

Lines changed: 15 additions & 15 deletions
Large diffs are not rendered by default.

index.pdf

-1.65 KB
Binary file not shown.

sections/01-introduction.embed.ipynb

Lines changed: 1 addition & 1 deletion
@@ -14,7 +14,7 @@
 "\n",
 "Data and metadata standards that use tools and practices of OSS (“open-source standards” henceforth) reap many of the benefits that the OSS model has provided in the development of other technologies. The present report explores how OSS processes and tools have affected the development of data and metadata standards. The report will survey common features of a variety of use cases; it will identify some of the challenges and pitfalls of this mode of standards development, with a particular focus on cross-sector interactions; and it will make recommendations for future developments and policies that can help this mode of standards development thrive and reach its full potential."
 ],
-"id": "de5f3a24-ae5b-4fb3-bb3c-5093d76da04f"
+"id": "66886525-7816-4a15-880e-c96685330644"
 }
 ],
 "nbformat": 4,

sections/01-introduction.out.ipynb

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 "\n",
 "Wilkinson, Mark D, Michel Dumontier, I Jsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. “The FAIR Guiding Principles for Scientific Data Management and Stewardship.” *Sci Data* 3 (March): 160018."
 ],
-"id": "7cc30035-220f-4f9b-9679-16fae38daefe"
+"id": "9f907562-5ea8-4a19-ac81-469cb647d23e"
 }
 ],
 "nbformat": 4,

sections/02-use-cases-preview.html

Lines changed: 2 additions & 2 deletions
@@ -191,15 +191,15 @@ <h2 class="anchored" data-anchor-id="high-energy-physics-hep">High-energy physic
 </section>
 <section id="earth-sciences" class="level2">
 <h2 class="anchored" data-anchor-id="earth-sciences">Earth sciences</h2>
-<p>The need for geospatial data exchange between different systems began to be recognized in the 1970s and 1980s, but proprietary formats still dominated. Coordinated standardization efforts brought the Open Geospatial Consortium (OGC) establishment in the 1990s, a critical step towards open standards for geospatial data. The 1990s have also seen the development of key standards such as the Network Common Data Form (NetCDF) developed by the University Corporation for Atmospheric Research (UCAR), and the Hierarchical Data Format (HDF), a set of file formats (HDF4, HDF5) that are widely used, particularly in climate research. The GeoTIFF format, which originated at NASA in the late 1990s, is extensively used to share image data. The following two decades, the 2000s-2020s, brought an expansion of open standards and integration with web technologies developed by OGC, as well as other standards such as the Keyhole Markup Language (KML) for displaying geographic data in Earth browsers. Formats suitable for cloud computing also emerged, such as the Cloud Optimized GeoTIFF (COG), followed by Zarr and Apache Parquet for array and tabular data, respectively. In 2006, the Open Source Geospatial Foundation (OSGeo, https://www.osgeo.org) was established, demonstrating the community’s commitment to the development of open-source geospatial technologies. While some standards have been developed in the industry (e.g., Keyhole Markup Language (KML) by Keyhole Inc., which Google later acquired), they later became international standards of the OGC, which now encompasses more than 450 commercial, governmental, nonprofit, and research organizations working together on the development and implementation of open standards (https://www.ogc.org).</p>
+<p>The need for geospatial data exchange between different systems began to be recognized in the 1970s and 1980s, but proprietary formats still dominated. Coordinated standardization efforts brought the Open Geospatial Consortium (OGC) establishment in the 1990s, a critical step towards open standards for geospatial data. The 1990s have also seen the development of key standards such as the Network Common Data Form (NetCDF) developed by the University Corporation for Atmospheric Research (UCAR), and the Hierarchical Data Format (HDF), a set of file formats (HDF4, HDF5) that are widely used, particularly in climate research. The GeoTIFF format, which originated at NASA in the late 1990s, is extensively used to share image data. The following two decades, the 2000s-2020s, brought an expansion of open standards and integration with web technologies developed by OGC, as well as other standards such as the Keyhole Markup Language (KML) for displaying geographic data in Earth browsers. Formats suitable for cloud computing also emerged, such as the Cloud Optimized GeoTIFF (COG), followed by Zarr and Apache Parquet for array and tabular data, respectively. In 2006, the Open Source Geospatial Foundation (OSGeo, <a href="https://www.osgeo.org">https://www.osgeo.org</a>) was established, demonstrating the community’s commitment to the development of open-source geospatial technologies. While some standards have been developed in the industry (e.g., Keyhole Markup Language (KML) by Keyhole Inc., which Google later acquired), they later became international standards of the OGC, which now encompasses more than 450 commercial, governmental, nonprofit, and research organizations working together on the development and implementation of open standards <a href="https://www.ogc.org">https://www.ogc.org</a>.</p>
 </section>
 <section id="neuroscience" class="level2">
 <h2 class="anchored" data-anchor-id="neuroscience">Neuroscience</h2>
 <p>In contrast to the previously-mentioned fields, Neuroscience has traditionally been a “cottage industry”, where individual labs have generated experimental data designed to answer specific experimental questions. While this model still exists, the field has also seen the emergence of new modes of data production that focus on generating large shared datasets designed to answer many different questions, more akin to the data generated in large astronomy data collection efforts <span class="citation" data-cites="Koch2012-ve">(<a href="#ref-Koch2012-ve" role="doc-biblioref">Koch and Clay Reid 2012</a>)</span>. This change has been brought on through a combination of technical advances in data acquisition techniques, which now generate large and very high-dimensional/information-rich datasets, cultural changes, which have ushered in new norms of transparency and reproducibility, and funding initiatives that have encouraged this kind of data collection. However, because these changes are recent relative to the other cases mentioned above, standards for data and metadata in neuroscience have been prone to adopt many elements of modern OSS development. Two salient examples in neuroscience are the Neurodata Without Borders file format for neurophysiology data <span class="citation" data-cites="Rubel2022NWB">(<a href="#ref-Rubel2022NWB" role="doc-biblioref">Rübel et al. 2022</a>)</span> and the Brain Imaging Data Structure (BIDS) standard for neuroimaging data <span class="citation" data-cites="Gorgolewski2016BIDS">(<a href="#ref-Gorgolewski2016BIDS" role="doc-biblioref">Gorgolewski et al. 2016</a>)</span>. BIDS in particular owes some of its success to the adoption of OSS development mechanisms <span class="citation" data-cites="Poldrack2024BIDS">(<a href="#ref-Poldrack2024BIDS" role="doc-biblioref">Poldrack et al. 2024</a>)</span>. For example, small changes to the standard are managed through the GitHub pull request mechanism; larger changes are managed through a BIDS Enhancement Proposal (BEP) process that is directly inspired by the Python programming language community’s Python Enhancement Proposal procedure, which is used to introduce new ideas into the language. Though the BEP mechanism takes a slightly different technical approach, it tries to emulate the open-ended and community-driven aspects of Python development to accept contributions from a wide range of stakeholders and tap a broad base of expertise.</p>
 </section>
 <section id="community-science" class="level2">
 <h2 class="anchored" data-anchor-id="community-science">Community science</h2>
-<p>Another interesting use case for open-source standards is community/citizen science. An early example of this approach is OpenStreetMap (https://www.openstreetmap.org), which allows users to contribute to the project development with code and data and freely use the maps and other related geospatial datasets. But this example is not unique. Overall, this approach has grown in the last 20 years and has been adopted in many different fields. It has many benefits for both the research field that harnesses the energy of non-scientist members of the community to engage with scientific data, as well as to the community members themselves who can draw both knowledge and pride in their participation in the scientific endeavor. It is also recognized that unique broader benefits are accrued from this mode of scientific research, through the inclusion of perspectives and data that would not otherwise be included. To make data accessible to community scientists, and to make the data collected by community scientists accessible to professional scientists, it needs to be provided in a manner that can be created and accessed without specialized instruments or specialized knowledge. Here, standards are needed to facilitate interactions between an in-group of expert researchers who generate and curate data and a broader set of out-group enthusiasts who would like to make meaningful contributions to the science. This creates a particularly stringent constraint on transparency and simplicity of standards. Creating these standards in a manner that addresses these unique constraints can benefit from OSS tools, with the caveat that some of these tools require additional expertise. For example, if the standard is developed using git/GitHub for versioning, this would require learning the complex and obscure technical aspects of these system that are far from easy to adopt, even for many professional scientists.</p>
+<p>Another interesting use case for open-source standards is community/citizen science. An early example of this approach is OpenStreetMap <a href="https://www.openstreetmap.org">https://www.openstreetmap.org</a>, which allows users to contribute to the project development with code and data and freely use the maps and other related geospatial datasets. But this example is not unique. Overall, this approach has grown in the last 20 years and has been adopted in many different fields. It has many benefits for both the research field that harnesses the energy of non-scientist members of the community to engage with scientific data, as well as to the community members themselves who can draw both knowledge and pride in their participation in the scientific endeavor. It is also recognized that unique broader benefits are accrued from this mode of scientific research, through the inclusion of perspectives and data that would not otherwise be included. To make data accessible to community scientists, and to make the data collected by community scientists accessible to professional scientists, it needs to be provided in a manner that can be created and accessed without specialized instruments or specialized knowledge. Here, standards are needed to facilitate interactions between an in-group of expert researchers who generate and curate data and a broader set of out-group enthusiasts who would like to make meaningful contributions to the science. This creates a particularly stringent constraint on transparency and simplicity of standards. Creating these standards in a manner that addresses these unique constraints can benefit from OSS tools, with the caveat that some of these tools require additional expertise. For example, if the standard is developed using git/GitHub for versioning, this would require learning the complex and obscure technical aspects of these system that are far from easy to adopt, even for many professional scientists.</p>
 <div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="0" role="list">
 <div id="ref-Basaglia2023-dq" class="csl-entry" role="listitem">
 Basaglia, T, M Bellis, J Blomer, J Boyd, C Bozzi, D Britzger, S Campana, et al. 2023. <span>“Data Preservation in High Energy Physics.”</span> <em>The European Physical Journal C</em> 83 (9): 795.
