Commit b7923006 authored by AJRubio-Montero: Update README.md
established. For this purpose, onedataSim includes two main programs:
- caches partial results as local scratch and then copies them to the official
  [LAGO repository](https://datahub.egi.eu) based on
  [OneData](https://github.com/onedata);
- generates standardised metadata for every input and result and includes it
  as extended attributes in the OneData filesystem.

2. **``do_showers_onedata.py``**, which:

- executes the analysis as ``do_showers.sh`` does;
- caches the selected simulation locally from the official
  [LAGO repository](https://datahub.egi.eu) and then stores the results back
  in the repository;
- also generates standardised metadata for these results and updates the
  corresponding catalog on OneData.
Storing results on the official repository with standardised metadata enables:

- sharing results with other LAGO members;
- future searches and publishing through institutional/government catalog
  providers and virtual observatories such as
  [B2FIND](https://b2find.eudat.eu/group/lago);
- properly citing scientific data and disseminating results through the
  internet via Handle.net PIDs;
- building new results based on data mining or big data techniques thanks to
  linked metadata.
Therefore, we encourage LAGO researchers to use these programs for their
simulations.
## Pre-requisites
1. Be accredited in the
   [LAGO Virtual Organisation](https://lagoproject.github.io/DMP/docs/howtos/how_to_join_LAGO_VO/)
   to obtain a personal OneData
   [token](https://lagoproject.github.io/DMP/docs/howtos/how_to_login_into_OneData/).
2. Have [Docker](https://www.docker.com/)
   (or [Singularity](https://singularity.lbl.gov/)
   or [udocker](https://pypi.org/project/udocker/))
   installed on your PC (or HPC/HTC facility).
Only the [Docker Engine](https://docs.docker.com/engine/install/) is needed to
run the onedataSim container, that is, the *SERVER* mode. However, the *DESKTOP*
mode is the only one available for Windows and macOS; it includes the Docker
Engine but also additional functionality.
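To check which of the supported runtimes is already available on a machine, a small helper like the following can be used (a convenience sketch, not part of onedataSim):

```shell
#!/bin/sh
# Convenience sketch (not part of onedataSim): report the first container
# runtime found among those mentioned in the pre-requisites.
pick_runtime() {
    for rt in docker singularity udocker; do
        if command -v "$rt" >/dev/null 2>&1; then
            echo "$rt"
            return 0
        fi
    done
    echo "none"
    return 1
}

RUNTIME=$(pick_runtime) || true
echo "Detected runtime: $RUNTIME"
```

If it prints ``none``, install one of the runtimes before continuing.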
On Linux, the recommended way is to remove all Docker packages included by
default in your distro and to use the official Docker repositories instead.
For example, for an old Debian-based distribution such as Ubuntu:
```sh
# ...
```

On CentOS 7 with root:

```sh
# ...
```
## Downloading the official docker images to run onedataSim
onedataSim, ARTI and the required software (CORSIKA, GEANT4, ROOT) are built,
tested and packed into Docker images in a
[CI/CD fashion](https://jenkins.eosc-synergy.eu/job/eosc-synergy-org/job/onedataSim/).
When all steps of the CI/CD pipeline are successfully completed, the images
are certified and stored in the LAGO Collaboration space at
[Docker-Hub](https://hub.docker.com/u/lagocollaboration).
This process assures the quality of the published software.

Depending on the type of data that you want to generate and/or process (i.e.
[S0, S1, S2](https://lagoproject.github.io/DMP/DMP/#types-and-formats-of-generatedcollected-data)),
you should pull a different image, because of their size.

- **``onedatasim-s0``** is mainly for generating S0 datasets (simulations with
  ``do_sims_onedata.py``), but it also allows S1 analysis. Therefore it includes
  the modified CORSIKA for LAGO, which results in a heavy image (~911.7 MB).
- **``onedatasim-s1``** is only for generating S1 datasets (analysis with
  ``do_showers_onedata.py``), but the image is smaller (currently ~473.29 MB).
- (Future: ``onedatasim-s2`` will be mainly for generating S2 datasets (detector
  response). It will include GEANT4/ROOT and will consequently be the heaviest
  (~1 GB).)
```sh
sudo docker pull lagocollaboration/onedatasim-s0:dev
# ...
sudo docker pull lagocollaboration/onedatasim-s1:dev
```
(Currently, for our DockerHub space, downloads are limited to 100/day per IP.
If you have many nodes under a NAT, you should consider distributing the Docker
image internally through the ``docker save`` and ``docker load`` commands.)
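As a sketch of that workflow (the shared path ``/shared`` is just an example):

```shell
# On one node with direct DockerHub access: pull once and save to shared storage.
sudo docker pull lagocollaboration/onedatasim-s0:dev
sudo docker save -o /shared/onedatasim-s0.tar lagocollaboration/onedatasim-s0:dev

# On every other node behind the NAT: load the image from shared storage,
# consuming no DockerHub downloads.
sudo docker load -i /shared/onedatasim-s0.tar
```

This keeps the per-IP download count at one regardless of how many nodes need the image.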
## Executing a standardised simulation & analysis to be stored in OneData repositories for LAGO
This automatised execution is the preferred one in the LAGO collaboration.
You can execute ``do_sims_onedata.py`` or ``do_showers_onedata.py`` in a single
command, without the need to log into the container. If any parameter is
missing, the program prompts you for it; otherwise the run starts and the
current progress is shown while the results are automatically stored in
OneData.
```sh
export TOKEN="<personal OneData token (oneclient enabled)>"
# ...
sudo docker run --privileged -e ONECLIENT_ACCESS_TOKEN=$TOKEN \
  ...
```
3. Executing on a multi-processor server.
If you count on a standalone server for computing, or on a virtual machine
instantiated with enough processors, memory and disk, you only need to add the
**-j \<procs\>** parameter to enable multi-processing:
```sh
sudo docker run --privileged -e ONECLIENT_ACCESS_TOKEN=$TOKEN \
  ...
```
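A common way to choose the value for **-j** on Linux is to derive it from the processor count; the leave-one-core heuristic below is our assumption, not an onedataSim requirement:

```shell
# Derive a -j value from the number of available processors, leaving one
# core free for the system (a heuristic, not an onedataSim requirement).
NP=$(nproc)
PROCS=$(( NP > 1 ? NP - 1 : 1 ))
echo "Suggested parameter: -j $PROCS"
```

The resulting value would then be passed as ``-j $PROCS`` in the command above.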
### Executing on HTC clusters
If you have enough permissions (sudo) to run Docker in privileged mode on a
cluster and can get the computing nodes in exclusive mode, you can run many
simulations at a time.
For example, on Slurm batch systems.
```sh
sudo docker run --privileged -e ONECLIENT_ACCESS_TOKEN=$TOKEN \
  ...
```
### Executing on clusters instantiated by oneself in IaaS cloud providers
1. First you have to create and configure a cluster in the cloud:
   - using the EOSC public cloud, which enables the pre-configuration of Slurm
     and other schedulers (Kubernetes); see the
     [EOSC-Synergy training for LAGO](https://moodle.learn.eosc-synergy.eu/course/view.php?id=16);
   - using commercial public clouds (Amazon, Azure, Google, etc.);
   - using private clouds (institutional orchestrators such as OpenStack,
     OpenNebula, XenServer, VMware, etc.).
2. Example for a Slurm cluster instantiated on EOSC resources (pre-configured by IM):
You can access the head node through SSH using the ``cloudadm`` account, and
then gain root privileges with ``sudo``. Slurm and a directory shared by NFS
(/home) are already configured, but some configuration still has to be done:
sharing the users' directories and installing the packages needed for Docker:
```sh
sudo mkdir /home/cloudadm
# ...
sudo docker pull lagocollaboration/onedatasim-s0:dev
sudo docker save -o /home/cloudadm/onedatasim-s0.tar lagocollaboration/onedatasim-s0:dev
```
Then, you can launch simulations through ``sbatch``. The environment variables
will be exported to the execution nodes. Thus:
```sh
export TOKEN="MDAxY...LAo"
# ...
cd /home/cloudadm
sbatch simulation.sbatch
```
A ``simulation.sbatch`` file for testing functionality can be one that writes
the allowed parameters to ``<job number>.log``:
```sh
#!/bin/bash
# ...
sudo docker run --privileged -e ONECLIENT_ACCESS_TOKEN=$TOKEN -e ONECLIENT_PROVI...
```
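A fuller sketch of such a test ``simulation.sbatch`` follows; the ``#SBATCH`` options, the image tag and the assumption that ``do_sims_onedata.py`` prints its allowed parameters when invoked without arguments are assumptions, not guarantees of onedataSim:

```shell
#!/bin/bash
# Hypothetical simulation.sbatch sketch. Assumptions: the image tag, the
# #SBATCH options, and that do_sims_onedata.py lists its allowed parameters
# when run without arguments. TOKEN must be exported before calling sbatch.
#SBATCH --job-name=onedatasim-test
#SBATCH --output=%j.log    # Slurm expands %j to the job number
sudo docker run --privileged -e ONECLIENT_ACCESS_TOKEN=$TOKEN \
     lagocollaboration/onedatasim-s0:dev do_sims_onedata.py
```

All output, including the prompted parameter list, ends up in ``<job number>.log``.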
### Building the onedataSim container
Every container has different requirements. To build the ``onedatasim-s0``
container, it is necessary to provide as a parameter an official
``lago-corsika`` image as the base installation. This is so because ARTI
simulations currently call [CORSIKA 7](https://www.ikp.kit.edu/corsika/79.php),
whose source code is licensed only for the internal use of LAGO collaborators.
On the other hand, ``onedatasim-s2`` requires GEANT4/ROOT, and other official
images must be used.
Additionally, other parameters allow choosing the ARTI and onedataSim branches,
which is fundamental for development.
#### Example: building images from default branches (currently "dev")
```sh
sudo docker build --build-arg BASE_OS="lagocollaboration/geant4:TBD" \
  ...
```
#### Example: building ``onedatasim-s0`` from featured branches
If you have a newer release of *git* installed on your machine, you can build
the container with one command. Note that after the *.git* link there is a
'#' followed again by the ONEDATASIM_BRANCH name.
```sh
sudo docker build --build-arg ONEDATASIM_BRANCH="dev-ajrubio-montero" \
  ...
```
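A fuller sketch of such a one-command build follows; the repository URL and the ``-t`` tag are examples we assume here, shown only to illustrate the ``#<branch>`` fragment syntax of ``docker build``:

```shell
# Build directly from a git URL: the '#dev-ajrubio-montero' fragment selects
# the branch, repeating the ONEDATASIM_BRANCH value (URL and tag are examples).
sudo docker build \
     --build-arg ONEDATASIM_BRANCH="dev-ajrubio-montero" \
     -t onedatasim-s0:dev-ajrubio-montero \
     "https://github.com/lagoproject/onedataSim.git#dev-ajrubio-montero"
```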
### Logging into the container for development purposes
1. Running scripts & attaching a local directory at login.
To log into the container, you only have to run bash without parameters,
always positioned at the end of the command. Additionally, you can mount a
local directory inside the container with the parameter
**--volume \<local path\>:\<container path\>**. For example:
```sh
[pepe@mypc tmp]# ls /home/pepe/workspace
onedataSim samples geant4-dev
# ...
drwxr-xr-x 1 1034995 638198 0 Sep 13 16:17 S0_sac_60_200.0_75600_QGSII_flat
```
### Storing data on testing spaces based on OneData
You can use testing spaces such as ``test8`` to store testing runs during
development. For this purpose you should choose the suitable OneData provider
and use the ``--onedata_path`` parameter to select the correct path.
For ``test8``, you should choose ceta-ciemat-**02**.datahub.egi.eu and any
directory ``<dir>`` under the ``--onedata_path /mnt/datahub.egi.eu/test8/<dir>``
path:
```sh
export TOKEN="MDAxY...LAo"
# ...
```
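As an illustrative sketch only: ``ONECLIENT_PROVIDER_HOST`` is the standard oneclient way to select a provider, but the directory name ``mytest`` and the exact parameters accepted by ``do_sims_onedata.py`` are assumptions here:

```shell
export TOKEN="MDAxY...LAo"
# Select the provider serving test8 and point the results at a test directory
# (directory name and script parameters are illustrative assumptions).
sudo docker run --privileged -e ONECLIENT_ACCESS_TOKEN=$TOKEN \
     -e ONECLIENT_PROVIDER_HOST="ceta-ciemat-02.datahub.egi.eu" \
     lagocollaboration/onedatasim-s0:dev \
     do_sims_onedata.py --onedata_path /mnt/datahub.egi.eu/test8/mytest
```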