VAMP: Virtual Analysis and Data Management Platform
Concept of the VAMP platform for data analysis
Interactive analysis of imaging datasets using the VAMP platform
Microscopy data is a major driver of data growth at the Biozentrum due to the simultaneous advent of new imaging techniques and improvements in the automation of image acquisition. These exciting developments allow the detailed analysis of phenomena on the nanometer scale, as well as the study of biological processes in whole organs or individuals over time. However, such studies generate large amounts of data, which are typically inspected and analyzed interactively. This analysis is made challenging by a number of issues:
- Data sets are large, and their analysis is CPU- and I/O-intensive. Staging data sets to and from local storage can take as long as the analysis itself.
- Analysis workflows are variable and often depend on visual inspection and interaction by the user. Software in common use is written with this usage pattern in mind. Therefore, massive parallelization on a Linux HPC cluster - a standard approach to make large-data analysis tractable - is rarely a practical option.
- Finally, data analysis workflows often depend on manual intervention, including tedious copy steps. This is error-prone and leads to data duplication.
In collaboration with the Imaging Core Facility, we provide virtualized analysis workstations (VAMP: Virtual Analysis and Data Management Platform) with which our scientists can obtain the required storage and computing capabilities on demand.
Key advantages of this system are:
- Flexibility: A number of different virtual workstation types are provided, differing in hardware and installed software.
- Accessibility: Available from anywhere within the Uni Basel VPN. Sessions can be disconnected and reconnected.
- Processing power (CPU+GPU) resides on a powerful server, the client only displays a view of the data. This central resource is shared, leading to better utilization.
- Routine data workflows can be automated to simplify use and prevent errors.
- Resource management is automated through reporting and management scripts.
- Dedicated and fast storage with automatic reminders and cleanup.
The management of dataset lifecycles is greatly facilitated by the use of the VAMP platform and its designated analysis storage area.
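As an illustration of how automatic reminders and cleanup on an analysis storage area could work, the following is a minimal sketch and not the actual VAMP implementation. The thresholds `REMIND_AFTER` and `CLEANUP_AFTER`, the function names, and the assumption that each dataset is one directory under a storage root are all hypothetical.

```python
from datetime import datetime, timedelta
from pathlib import Path
from typing import Dict, Optional

# Hypothetical thresholds; the real VAMP retention policy is not stated here.
REMIND_AFTER = timedelta(days=30)
CLEANUP_AFTER = timedelta(days=60)


def classify_age(age: timedelta) -> str:
    """Map the age of a dataset to a lifecycle action."""
    if age >= CLEANUP_AFTER:
        return "cleanup"
    if age >= REMIND_AFTER:
        return "remind"
    return "keep"


def scan_storage(root: Path, now: Optional[datetime] = None) -> Dict[str, str]:
    """Classify every dataset directory under `root` by last modification time.

    Assumes (hypothetically) one directory per dataset. Nothing is deleted;
    the function only reports which action would apply.
    """
    now = now or datetime.now()
    report = {}
    for entry in sorted(root.iterdir()):
        if entry.is_dir():
            modified = datetime.fromtimestamp(entry.stat().st_mtime)
            report[entry.name] = classify_age(now - modified)
    return report
```

A reporting script could call `scan_storage` on the analysis storage area, send a reminder to the owner of each dataset marked `remind`, and queue those marked `cleanup` for removal after confirmation.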