Introduction to pyIBA#
Ion Beam Analysis (IBA) is a powerful suite of techniques, widely adopted in fields like materials science, nuclear physics, and biology. However, despite its utility, the management and analysis of IBA data can often require specialised software and expertise.
In today’s data-driven scientific landscape, managing and sharing data efficiently is pivotal. The recent introduction of the Ion Beam Data Format (IDF) seeks to meet this need in the IBA community. The IDF’s main objective is to streamline the storing, sharing, and processing of IBA data, aligning well with contemporary research practices like FAIR (Findable, Accessible, Interoperable, and Reusable). A universal file format is essential for enhancing collaboration between labs and ensuring scientific transparency. This is especially crucial in significant European projects like RADIATE EU Project and the newly introduced ReMade@ARI EU Project, where vast amounts of data are generated and spread across various facilities.
To boost the utility of IDF files and simplify IBA data analysis and management, we developed pyIBA, an open-source Python library.
pyIBA: The Pythonic Answer to IBA#
pyIBA offers a suite of methods to create, edit, process, and analyse IDF files. Based on Object-Oriented Programming principles, it presents a clear and intuitive syntax, making it not just powerful, but also user-friendly. It integrates seamlessly with other established Python scientific libraries, such as Numpy, Matplotlib, and TensorFlow. Moreover, it’s tailor-made for use with Jupyter Notebooks, combining interactive coding with the notebook’s unique logging capabilities.
In addition to all its core features, pyIBA also supports NDF, a recognised analysis code in the IBA community. NDF’s ability to perform simultaneous analysis of data from multiple experiments and techniques is noteworthy. Leveraging the IDF format’s capabilities, all related details – from parameters and models to results – can be stored alongside experimental data in a single IDF file. This unified approach ensures that all relevant information is readily accessible in one place.
Moreover, the modular design approach of pyIBA promises easy future integrations, leaving the door open for more codes and tools in addition to NDF.
We have also developed IBA Studio, a graphic user interface based on pyIBA. This interface allows for quick viewing, editing, and analysis of IDF data, especially catering to users who might be less inclined toward coding.
For further information on pyIBA, head to:
Why adopt a standardised format? for more information on the IDF format
Installation page to start using IDF
API for a full list of the methods in each class, their documentation and some examples of their use.
Basic Examples for everyday examples and an introduction to the pyIBA environment
Advanced Examples for more advanced examples and how to run NDF from pyIBA
Note
This project is under active development. Any contribution is welcome!
Why adopt a standardised format?#
As of now, when an ion beam scientist leaves the measurement station, he is forced to carry the data associated with the experiment in several formats and mediums:
the spectra data is likely stored on some type of hardware or cloud storage;
the experimental details are written on a paper notebook (sometimes in a digital one)
other details, such as those related to the sample or the type of measurements already made, are often stored in a series of e-mails;
information on how to link the above three points, and that extra bit of “not so important” information, is commonly trusted in each one’s memory.
Afterwards, the ion beam scientist goes to his desk to analyse what he just did. The first step should be to gather all the information and format it into some specific analysis method. This can be harder than it looks, in particular, if a considerable time has passed between the experiment and the analysis. Among the issues one might face, the more prominent are likely:
knowing where the files were stored;
remembering in which entry and in what logbook concerns the experiment;
doing some “e-mail archaeology” (to quote a veteran Ion Beam Scientist);
asking why we thought the above data storage locations would never be forgotten.
After battling the above issues, it is time to share the work. The scientist should compile, in the smaller number of files possible, the experimental process and raw data, the analysis methods and outputs and, ideally, the considerations taken in every step. This is perhaps where the current workflow fails the most. Since there is no agreed method to compile this, confusion is frequently set between collaborative groups leading to an eventual loss of information. To avoid this, the careful scientist will write long reports or e-mails that may not be definitive. Alternatively, group members exchange numerous e-mails or schedule zoom meetings. Either method takes an ever-increasing amount of time and energy and leaves one of the most important scientific principles unchecked - the results are hardly reproducible or will at least take as much effort as the initial try.
After substantial sharing and consequent feedback, either by collaborative groups or through the peer-reviewer process, a return to the accelerator is frequently needed. This return multiplicates all of the issues described until now, as illustrated in the figure below. The data files, the logbook entries, the number of e-mails and the kB stored in our memory all increase with each loop of acquisition-share. The complexity can rapidly become chaotic (“an order yet undeciphered”) when the person returning to the accelerator is not the original one (as usually happens in groups with students).
At this point, serious loss of information may incur. The what, why, who, when and how behind each experiment can easily be “lost in translation”. At best, this slows the scientific process. At worst, it leads to wrong conclusions.
Therefore, a change in the management of data is urgently needed in the IBA community. In this context, the IDF (IBA Data Format), a XML schema, has been accepted by the RADIATE EU Project members as the standardised data format for IBA. However, until now, the format has seen little adoption by the community, partially because handling XML files is a tedious and time-consuming task. This makes updating the existing data acquisition and analysis software an unattractive investment, even though there are obvious gains in having a unified data format.
Instead of having information scattered through multiple locations, we leave the accelerator with a single IDF file. This file contains the spectra produced during the experiment, the experimental parameters of each measurement, the sample details and any event that might have occurred.
When the analysis task arrives, the IDF file should be directly accepted by the analysis tools, both on the input and output sides of the codes. This means that not only the outputs but also the analysis methods and considerations are saved in the same IDF file.
When sharing, the scientist only has one single file to send. This is an obvious advantage and simplifies greatly the entire sharing feedback loop. The analysis can be readily reproduced either using the same analysis code or another one (and saved on the same file). Furthermore, returning to the accelerator will not increase the complexity in the same manner as described above. Any new data produced can be saved in the original file using the same standarised method.
In summa, the entire workflow can be greatly expedited.
Table of Contents#
- Introduction to pyIBA
- Why adopt a standardised format?
- Installation
- Examples
- Basic Examples
- Advanced Examples
- A1 - Large IDF file with experimental and NDF fitted RBS-NRA-PIXE-SIMS data
- A2 - Opening IBA Studio from jupyter
- A3 - NDF and IDF files
- Fitting multiple spectra in the same IDF
- A4 - pyIBA and Tensorflow: Machine Learning demo
- Create the IDF file
- Run NDF
- Machine Learning
- A5 - Creating IDF files from NDF files
- API