The SKA and the Rise of Big Data in Radio Astronomy
The first decade of this century has seen a tremendous advance in information and digital technologies impacting scientific inquiry. Data being created by global projects in science and engineering, by the ubiquitous sensors tracking the state of the planet, by the connected internet of things, and by vast and complex collections of metadata that trace the patterns and trends in human behavior are beginning to be creatively mined in ways that fundamentally change our perception of the world and empower global change. Organizations that can rise to the big data challenge will be the leaders in this new era of research. The Square Kilometre Array project, including SKA technology developments in South Africa, is driving one of the most significant big data challenges of the coming decade. Instantaneous observing bandwidths have increased by two orders of magnitude. High dynamic range imaging demands and the need to mitigate narrow-band radio frequency interference have driven observing programs to simultaneously sample vast numbers of frequency channels. Data rates to researchers are 10³ to 10⁴ times larger than was typical only a few years ago and continue to grow exponentially. In response to this data transformation, a new field of astroinformatics has emerged, defined as a set of “naturally-related specialties including data organization, data description, astronomical classification taxonomies, astronomical concept ontologies, data mining, visualization, and statistics”. Scientific leadership on the pathway to the SKA requires not simply access to facilities for processing and storage of data but innovation in computational approaches and algorithms, technologies for visualization and visual analytics of big data, and e-science tools for access and collaborative research around big data by regionally or globally distributed teams of researchers.
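The link between channel count and data rate can be illustrated with a back-of-the-envelope estimate. The sketch below is not taken from the talk: it assumes a simple interferometer model in which every antenna pair (baseline) produces one complex visibility per frequency channel and polarisation product per integration, and all numbers (antenna count, channel counts, bytes per visibility) are hypothetical.

```python
def visibility_data_rate(n_antennas, n_channels, n_pols=4,
                         bytes_per_vis=8, t_int_s=1.0):
    """Rough raw visibility data rate in bytes per second.

    Assumes one complex visibility (bytes_per_vis bytes) per baseline,
    channel and polarisation product every t_int_s seconds.
    """
    n_baselines = n_antennas * (n_antennas - 1) // 2
    return n_baselines * n_channels * n_pols * bytes_per_vis / t_int_s


# Hypothetical array of 64 antennas: a coarse, older-style setup with
# 32 channels versus a modern setup with 32768 channels.
old_rate = visibility_data_rate(64, 32)
new_rate = visibility_data_rate(64, 32768)
print(f"old: {old_rate / 1e6:.1f} MB/s, new: {new_rate / 1e9:.2f} GB/s")
print(f"growth factor: {new_rate / old_rate:.0f}x")
```

In this simplified model the rate scales linearly with the number of channels, so moving from tens to tens of thousands of channels alone accounts for a growth factor of about a thousand.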
Professor Russ Taylor
Russ Taylor received a B.Sc. in Astronomy from the University of Western Ontario in 1976, and a Ph.D. in Physics (Radio Astronomy) from the University of British Columbia in 1982. He is currently South African Joint Research Chair in Radio Astronomy at the University of Cape Town and University of the Western Cape. Before coming to South Africa in 2014, Professor Taylor was Professor of Astrophysics at the University of Calgary and Director of the three-university Institute for Space Imaging Science. Past positions include: Head of the Department of Physics and Astronomy, University of Calgary; Visiting Scientist, U.S. National Radio Astronomy Observatory; Distinguished Visiting Scientist, Australian Commonwealth Industrial Research Organization; Research Associate, University of Manchester, Jodrell Bank Observatory; Research Associate, University of Groningen, Kapteyn Astronomical Laboratory; NSERC Postdoctoral Fellow, University of Toronto.
Municipal Administrative Data: Big Data in Economics
A few large municipalities are increasingly the focus of service delivery for the majority of South Africa's population. In delivering services to their populations these municipalities have to balance two sorts of concerns:
- the welfare of the households within their municipal boundaries, and
- the fiscal sustainability of delivering services to those households.
In the process of delivering these services, municipalities generate large amounts of household level data daily. So far these have been under-utilised as a resource for research in the social sciences. This project, a collaboration between Grant Smith (UCT) and Kelsey Jack (Tufts University), has collected 10 years’ worth of water and electricity consumption data, payment records and property values at the household level. The project seeks to leverage that data for the City of Cape Town in order to assess a host of questions around household welfare and service delivery sustainability.
This presentation speaks to the particular challenges of using such big data in the social sciences, which are mainly of three sorts:
- protecting confidential data,
- creating and maintaining partnerships with the municipality's key departments, and
- integrating disparate data sets in order to produce a data set amenable to enquiry.
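Two of these challenges, protecting confidential data and integrating disparate data sets, can be sketched together. The example below is an illustration only and not the project's actual pipeline: it assumes that each source identifies households by an account number, replaces that number with a keyed hash (HMAC-SHA256) so the raw identifier never reaches the research data set, and then merges records by pseudonym and month. The field names (`account_id`, `month`, `water_kl`, `electricity_kwh`) and the key are hypothetical.

```python
import hashlib
import hmac
from collections import defaultdict


def pseudonymise(account_id: str, secret_key: bytes) -> str:
    """Return a stable pseudonym for an account using a keyed hash."""
    digest = hmac.new(secret_key, account_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def merge_by_household(water_rows, electricity_rows, secret_key):
    """Merge two record sets on (pseudonym, month), dropping raw IDs."""
    merged = defaultdict(dict)
    for row in water_rows:
        key = (pseudonymise(row["account_id"], secret_key), row["month"])
        merged[key]["water_kl"] = row["water_kl"]
    for row in electricity_rows:
        key = (pseudonymise(row["account_id"], secret_key), row["month"])
        merged[key]["electricity_kwh"] = row["electricity_kwh"]
    return [
        {"household": household, "month": month, **values}
        for (household, month), values in sorted(merged.items())
    ]


if __name__ == "__main__":
    key = b"hypothetical-secret-held-by-the-data-custodian"
    water = [{"account_id": "A1", "month": "2014-01", "water_kl": 12.5}]
    power = [{"account_id": "A1", "month": "2014-01", "electricity_kwh": 300.0}]
    for record in merge_by_household(water, power, key):
        print(record)
```

A keyed hash is used rather than a plain hash so that someone who knows the account-number format cannot rebuild the mapping without the key; in such a design the key would stay with the data custodian.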
Grant Smith
Computational modelling in cellular and tissue BioMechanics
Scaffold-based engineering of biological tissues and organs in regenerative medicine requires the understanding of mechanical properties of the scaffolds as well as the effects of interplay of scaffolds and ingrowing cells and tissue on the transient structural properties of the engineered tissue. Computational models can facilitate the study of these mechanobiological aspects and guide the development of scaffolds for tissue regeneration. The present research focused on the development of computational models to investigate the mechanics of fibrous polymeric scaffolds and single cells.
Scaffolds with a high degree of fibre alignment were electro-spun from bio-stable Pellethane. Scaffold samples were placed in dorsal, subcutaneous pockets in a Wistar rat model. All animal experiments were approved by the institutional review board of the University of Cape Town and were performed in accordance with the National Institutes of Health (NIH, Bethesda, MD) guidelines. Samples were explanted after 7, 14 and 28 days and underwent uniaxial tensile testing in fibre and cross-fibre directions. Stress-strain data were utilized in finite element models to describe the constitutive properties of the scaffold with a transversely isotropic hyperelastic strain energy function.
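For readers unfamiliar with the term, one widely used transversely isotropic hyperelastic strain energy function combines an isotropic matrix term with an exponential fibre term. The form below is given only as a representative example; the abstract does not state which function was fitted, and the symbols $\mu$, $k_1$, $k_2$ and $\mathbf{a}_0$ are introduced here for illustration.

```latex
W = \frac{\mu}{2}\,(I_1 - 3)
  + \frac{k_1}{2 k_2}\left\{ \exp\!\left[ k_2\,(I_4 - 1)^2 \right] - 1 \right\},
\qquad
I_1 = \operatorname{tr}\mathbf{C},
\quad
I_4 = \mathbf{a}_0 \cdot \mathbf{C}\,\mathbf{a}_0
```

Here $\mathbf{C}$ is the right Cauchy-Green deformation tensor and $\mathbf{a}_0$ is the unit vector along the fibre direction in the undeformed scaffold. For a uniaxial stretch $\lambda$ along the fibres, $I_4 = \lambda^2$, so the second term stiffens the response in the fibre direction relative to the cross-fibre direction, which is the kind of difference that tensile tests in both directions can expose.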
One scaffold sample underwent structural assessment using micro-computed tomography (μCT). A three-dimensional geometry of the fibrous scaffold structure was reconstructed from the μCT data using image processing software ScanIP (Simpleware v6, Simpleware Ltd, Exeter, UK). This geometry facilitated the development of a finite element model (Abaqus, Dassault Systèmes Simulia Corp, Providence, RI, USA) for the prediction of the mechanics of the fibrous network during deformation of the scaffold.
Fibroblasts were cultured and stained for actin fibres and nuclei. Image stacks of cell populations were obtained with confocal microscopy. The three-dimensional geometry of the cytosol and the nucleus of a single fibroblast were reconstructed using ScanIP and the resulting finite element mesh was exported to Abaqus to study the cell-substrate mechanics and the effect of modelling focal adhesions on the predicted intracellular deformation.
Associate Professor Thomas Franz
Thomas Franz received MSc and PhD degrees in Mechanical Engineering from the Universities of Hannover and Bremen, Germany, in 1992 and 1998. With post-doctoral fellowships from the South African National Research Foundation and the Claude Leon Foundation, he conducted research in mechanics of materials at the Centre for Research in Computational and Applied Mechanics at the University of Cape Town between 1998 and 2001. In 2002, his research focus changed to biomechanics in cardiovascular diseases and therapies and he led the Biomechanics and Mechanobiology Laboratory at UCT’s Chris Barnard Division of Cardiothoracic Surgery. In May 2014, he took up the position as Head of Division of Biomedical Engineering in the Department of Human Biology, Faculty of Health Sciences, UCT. Thomas is a Fellow of the Programme for the Enhancement of Research Capacity (PERC) at UCT’s Research Office, contributing to research development and promotion of interdisciplinary research.
Fibre orientation projection mapping for patient-specific computational modelling in cardiac diseases and therapies
Cardiovascular disease is the single leading cause of death in the world, accounting for 30% of all human mortality (WHO 2011). Despite the recent advancements of pharmaceutical, surgical, device and tissue engineered therapy strategies, cardiovascular disease remains one of the most costly, common and deadly medical conditions. Additionally, projections show an increase in predicted mortality, keeping cardiovascular disease as the leading cause of death globally (Mathers and Loncar 2006, WHO 2011).
With the advancement of high performance computing, and its accessibility, scientific models have the freedom to include more complex and realistic descriptions of the heart. There is currently a great deal of interest in using models as diagnostic or therapeutic aids. Reliable computational models have the potential to provide a richer source of information for clinical decision making, treatment, and the development of medical products. Predictive computational modelling would facilitate accurate diagnosis and patient-optimised treatment.
The description of fibre orientation is an essential component of any cardiac computational model. The fibre orientation in the heart has a critical influence on mechanical and electrical function. The physiological description of fibre orientation in the heart is a highly complex and sophisticated task, which has been the subject of significant historical disagreement (Gilbert, Benson et al. 2007, Buckberg, Mahajan et al. 2008). This stems from the complex multi scale branching and merging that occurs on a cellular and micro scale, creating anisotropy features that change dramatically throughout the structure. The description of the cardiac structure is therefore a three dimensional network problem.
An accurate portrayal of the fibre orientation in a computational model needs to incorporate not only the one dimensional fibre orientation tangent to the fibre direction, but also include a description of the laminar sheet, in which the fibre is embedded. It has been shown that cardiac tissue exhibits sheet-like behaviour (LeGrice, Hunter et al. 2001, Pope, Sands et al. 2008) which influences both passive and active material response.
To achieve this, a projection mapping algorithm is introduced which aggregates in vivo and ex vivo experimental data as a generic description and transfers this information onto realistic patient-specific geometries of the left ventricle in healthy and diseased states.
A suitability and sensitivity analysis is performed on computational models of the human left ventricle in the finite element framework.
Kevin Sack
Kevin Sack is a doctoral student at the Biomechanics and Mechanobiology Lab, University of Cape Town, South Africa. His research focuses on patient-specific modelling of cardiac problems within finite element computational models.
An eResearch case study in advanced facility support: UCT Aaron Klug Centre for Imaging and Analysis
Incorporating advanced imaging techniques, such as TEM, in research projects has always demanded significant resources, both in expensive infrastructure and in user training. With current technological advances driving rapid changes in data acquisition and analysis, we are now also experiencing an inevitable increase in the rate and complexity of the data collected. If the promise of these developments is to be fully exploited in the quest to tackle more difficult scientific problems, then modern imaging facilities have to respond to this deluge of users and data. The increasingly collaborative and interdisciplinary nature of research, and the interconnectedness of facilities in national distributed centres of excellence, also demand innovative tools for archiving, sharing, and publishing data.
UCT’s eResearch Centre is currently partnering with the new Aaron Klug Centre for Imaging and Analysis to address these challenges and offer world-class imaging services for fields ranging from catalysis to medical microbiology. Through the provisioning of storage, advanced computation, and data management, eResearch aims to enable the research agendas of every user of the facility. In this talk, we will present the commissioning of the automated TEM acquisition and image processing systems Leginon and Appion, including our implementation of the underlying hardware and database server architectures. These tools have enhanced workflows in single-particle cryo-EM, not only improving the quality and management of data but also lowering the barriers to access. Future plans for overarching user, data, and facility management will also be discussed.
Dr Jason van Rooyen
Going Full Circle: Research Data Management @ University of Pretoria
The second part of the presentation will focus on the projects and technology (software and hardware) used. The University of Pretoria has adopted an Enterprise Content Management (ECM) approach to manage its Research Data. ECM is not a singular platform or system but rather a set of strategies, tools and methodologies that interoperate with each other to create a comprehensive management tool. These sets create an all-encompassing process addressing document, web, records and digital asset management. At the University of Pretoria we address all these processes with different software suites and tools to create a complete management system. Each process presented its own technical challenges. These had to be addressed, while keeping in mind the end objective of supporting researchers throughout the whole research process and data life cycle. Various platforms and standards have been adopted to meet the University of Pretoria’s criteria. To date three processes have been addressed namely, the capturing of data during the research process, the dissemination of data and the preservation of data.
Johann van Wyk
Johann van Wyk is currently the Assistant Director: Research Data Management at the Department of Library Services, University of Pretoria, responsible for the development of a Research Data Management programme at the University of Pretoria. He completed his Master’s Degree in 2005 at UP with a thesis titled “Communities of Practice: an important element in the knowledge management practices of an academic library as learning organisation”. He is currently working on his PhD degree, which focuses on “Research Data Management as an essential component within Virtual Research Environments”. He has presented several papers on knowledge management, Communities of Practice and Web 2.0, and more recently on Research Data Management.
Isak van der Walt
Isak van der Walt has been part of the University of Pretoria since 1999, forming part of the user support team up to 2008. From there he became Manager of Information Systems for a campus-affiliated company focusing on market research. In 2010 he was appointed as Senior Network Analyst and later as Senior Systems Analyst for the Library Services at the University of Pretoria. He specializes in the development of new systems according to the organisational goals and objectives and uses experience from various Information Technology domains to meet strategic needs.
He is also a Technical Specialist assisting the CSIR in developing and maintaining a multinational Virtual Research Environment. Isak is passionate about the technical challenges arising from data management and the importance thereof for academic institutions. He holds various IT-related qualifications acquired throughout the course of his career.
From Process to Practice: Establishing a Research Data Management Function in a Resources Constrained Environment
As a first step we decided to actively participate in the establishment and maintenance of our community of practice (CoP), the Network of Data and Information Curation Communities (NeDICC), so that we could be surrounded and supported by colleagues confronting the same challenges.
The first hurdle that we needed to face was the fact that we were severely resource constrained. We could not initially convince management that we needed a new post and we therefore had to repurpose an existing position before we could really start making any progress. We also made a conscious strategic decision to include RDM in the CSIR’s records management drive - so that we did not need to address the issue in isolation. Next we completed a situation analysis which allowed us to point out pockets of RDM excellence in our organisation. We will share some of the results of this research and plan to also explain how the CSIR made use of the CARDIO model to influence short term targets.
We would like to acknowledge the role that NeDICC continues to play in our process. We will briefly look at experiential learning as a vehicle for knowledge transfer. The differences in practice amongst our CoP partners will also be mentioned briefly. We intend to explain how the different requirements and demands affect the approach taken and the route followed when establishing an RDM function, and why the learning can nevertheless be transferred. NeDICC’s role in the UP-based Carnegie Professional Development Programme will, for example, be explained and some of the learning gained from that will be highlighted.
In conclusion we will briefly explain how we plan to break down the obstacles we expect to encounter on our journey up our Table Mountain.
Adele van der Merwe
Adèle is the Records Management and Archival team leader of the CSIR’s Information Services (CSIRIS). She obtained her master’s degree, on the development and implementation of an institutional repository within a science, engineering and technology environment, cum laude. The core function of her job is to ensure that the CSIR’s organizational heritage is well managed and remains accessible to the CSIR and the international community where applicable. Her responsibilities therefore also include looking for and testing possible solutions for the improved management of the organization’s research artefacts prior to making recommendations. She is responsible for monitoring metadata standards and new trends and tools in the area of knowledge dissemination.
Louise Patterton
Louise is a research data librarian at the Council for Scientific and Industrial Research. She was previously employed at the Agricultural Research Council and has worked in various library positions during the past 24 years. Her academic qualifications include a master’s degree in Human Movement Science, and she is currently working towards a master’s degree in Information Science, in which she is investigating the research data management behavior of emerging researchers.
Martie van Deventer
Martie's position is that of Portfolio Manager in the CSIR’s Information Services (CSIRIS). The core functions of her job are to ensure that the CSIR research staff are enabled in their quest to generate knowledge (by having access to peer reviewed, high quality information resources) and to guide the development of institutional memory systems and tools: systems that would address the CSIR’s need to disseminate its knowledge and information within the internal and the open access environments, and tools (mainly policies and procedures) that would guide the activities. Aligned to international trends, CSIRIS is currently in the process of investigating and implementing services and products to address the changing landscape brought about by eResearch and the related technologies. Building the supporting infrastructure for virtual research environments, developing the principles for sound digital preservation practices and skills development are focal areas of interest.
Jointly Exploiting Data Infrastructure to improve the impact of African Social Science – the APHRC case
CHAIN-REDS (www.chain-project.eu) is a project co-funded by the European Commission within its Seventh Framework Program with a consortium made of ten renowned organisations in the field of e-Infrastructures, representing Europe and most of the world regions (Africa, Asia, Latin America and the Caribbean, and the Middle-Eastern and Gulf Region). Its vision is to promote and support technological and scientific collaboration across different e-Infrastructures established and operated in various continents in order to facilitate their uptake and use by established and emerging virtual research communities (VRCs), but also by single researchers, promoting instruments and practices that can facilitate their inclusion in the global e-Science and e-Research.
APHRC is a non-governmental research institute in the social sciences domain. It collects data sets of high social importance, especially in the Sub-Saharan region, relating to health and wellbeing, social and economic indicators, food security, etc. This data is collected in a rigorous way, and is often used to prepare policy statements and responses to societal challenges in Nairobi, Kenya and beyond. Since it has such great importance and the ramifications of the analysis and interpretations of this data go deep, it is understandable that it should be held to account. The data should be accessible, understandable, and reliable, if possible, permanently.
In this paper, we describe work done in a collaboration between CHAIN-REDS and APHRC, to provide persistent identifiers to the APHRC microdata portal, and improve the impact and discoverability of the Centre's datasets. Some discussion will be included to address the implications of this work for social science data repositories in Africa, including aspects of repository metadata harvesting and aggregation services and interoperability between metadata standards.
Dr Bruce Becker
Bruce Becker is a senior researcher at the SANREN Competency Area of the CSIR Meraka Institute. He holds a Ph.D. from the University of Cape Town and has worked at the CEA (Paris) and INFN (Cagliari) on the ALICE experiment at the LHC. After moving back to South Africa, he kickstarted the South African National Grid, a federation of institutes, national laboratories and research groups providing an integrated computational and data infrastructure. He continues the coordination of SAGrid and, at the regional level, the Africa-Arabia Regional Operations Centre. His roles in this context are to ensure smooth technical interoperability between resource centres in the region, promote the deployment of new sites, provide technical and operational support and promote the uptake of services associated with the ROC. Working closely with the Ubuntunet Alliance, he has been involved in several FP7-funded support actions, including the ei4Africa, CHAIN and CHAIN-REDS projects. He works closely with SANReN in the area of identity federations, network-intensive applications and other advanced services.
Cheikh Faye
Cheikh Faye is a Senior Researcher at the APHRC. Prior to joining APHRC, Cheikh worked as a Program Officer at Agence Pour la Promotion des Activites de Population (APAPS) – Senegal, where he conducted several evaluation studies on HIV/AIDS, family planning, and maternal and child health. He also worked as a data analyst for IntraHealth International and a consultant for Population Council and World Food Program. Cheikh has more than ten years’ experience working on designing, implementing and evaluating research projects in Senegal and other countries in West Africa. He holds a Master’s degree in Statistics (2001) from the Houari Boumediene University of Sciences and Technology (Algiers, Algeria). Cheikh’s experience includes research on reproductive health (behavioral surveys on HIV/STI in Senegal and Mauritania, post-abortion care, family planning), local development and poverty.
Paul Odero
Paul Odero is the IT manager of the APHRC. He has over 7 years’ experience in Web Systems Development and holds a Bachelor of Science degree in Information Technology from Jomo Kenyatta University of Agriculture and Technology.
He also holds a Bachelor of Science diploma in Computer Science from Kenyatta University. Paul joined APHRC in 2012, prior to which he worked with Dew CIS Solutions Limited from 2005 to 2012 as the Web and Graphic Developer. He has also developed and implemented the intranet and several systems for the Communications Commission of Kenya (CCK), such as the Customer Relationship Management system, Enterprise Resource Planning system, Document Management system, and IT Helpdesk, with comprehensive training of all users in CCK. Paul is driven by perfection.
Dealing with the 3 deadly data sins and other delinquencies
Elias Makonko
Mr Elias Makonko is a Research Data Curator in the Data Curation unit of the Research Methodology and Data Centre. He holds a BSc (Hons) in Statistics from the University of Limpopo and a BSc in Mathematics and Statistics from the University of Pretoria.
Before joining the HSRC in his current capacity, he worked for Link Community Development as a Research Methods Specialist (M&E), for the Human Sciences Research Council as a Junior Researcher in HAST, and for Statistics South Africa as an Assistant Statistical Officer.
As a Research Data Curator, he prepares research data for preservation and dissemination by validating and checking deposited data and related documentation, reviewing and completing structured metadata, developing unstructured metadata to describe datasets, anonymising data for dissemination purposes and preparing preservation and dissemination files.
Mr Makonko also assists and trains research staff to manage research data according to best practice to support the preservation and dissemination of data, and facilitates the secondary use of data. Furthermore, he participates in activities to build capacity in research data curation in the wider South African curation community.
ORCID and DataCite Interoperability Network
Josh Brown
ORCID Regional Director, Europe
Whose data is it anyway? Identifying the values and incentives that encourage the deposit and sharing of research data
Many funding agencies require, as a prerequisite of the approval of grant funding, the submission of data management plans and the deposit of data into trusted digital archives on completion of the research project. This requirement speaks both to the need for increased social development resulting from research findings, and a greater return on investment in the research endeavour.
Data sharing practice is emerging in specific research communities, offering valuable insights into the values and incentives that challenge the alignment of emerging infrastructure services with the compliance and reward mechanisms of such communities. This paper will investigate the role of funder mandates and user perceptions of trust in infrastructure management as a measure of probability of data deposit and sharing.
However, data sharing means using, reusing, or merging different data into new data products. Researchers' concerns about the possible misuse of data, whether intentional or unintentional, cannot be ignored. While it is essential to foster data exploitation, several key issues related to open access will be addressed from ethical, legal and technical perspectives.
Dr Dale Peters
Developing an institutional research data management plan: Guidelines for universities
University of Pretoria
Dr Heila Pienaar is Deputy Director: Innovation and Technology in the Department of Library Services at the University of Pretoria. Her research focuses on e-information strategy formulation, digital library implementation, virtual research environments and research data management.
http://www.slideshare.net/heila1/dr-heila-pienaar-cv.
Integrating data management planning into institutional workflows – tools and challenges
With the assistance of the DCC or one of the other international institutions, this one-hour practical session will focus on the introduction and use of research data plans (including the use of the online tool).
Joy Davidson
Science set free: scientific output and open science policies for Europe and beyond
Presentation - Science set free: open science policies for Europe
Presentation - ZENODO Research. Shared
Najla Rettberg and Lars Holm Nielsen (CERN)
Najla Rettberg is the scientific manager of OpenAIREplus. She is based at the University of Göttingen in the Electronic Publications Unit of the library. Her background is in scholarly communication, open access and digital preservation, and she has worked in a variety of library-based organisations around Europe.
Building the eResearch Data-scopes for new Australian Centres of Research Excellence
We highlight how Monash University's eResearch program is supporting the strategy to create a world-class research environment to underpin the research of two new ARC Centres of Excellence:
- ARC Centre of Excellence in Advanced Molecular Imaging
- ARC Centre of Excellence for Integrative Brain Function
Such environments require the orchestration of specialised instruments, data storage and processing facilities, and advanced data visualisation environments. The Clayton Innovation Precinct is now home to a world-unique trifecta to support this vision:
- Advanced scientific instruments located at Monash University, CSIRO, Australian Synchrotron and affiliated medical research institutes;
- Unique data processing capabilities of the Multi-modal Australian ScienceS Imaging and Visualisation Environment (MASSIVE) HPC facility; and
- A world-class immersive visualisation environment for data analysis and collaboration (the CAVE2).
The way in which scientists apply these three capabilities in concert will be an archetype of the way research will be performed in the 21st century.
In this talk we outline the institutional support structures (the “Talent”), the advanced computing infrastructure (the “Technology”) and the agile project processes and management approaches (the “Tolerance” for risk and ambiguity) that have created a solid platform to enable the very best research to be conducted at the Centres of Excellence.
Prof C. Paul Bonnington
Prof Paul Bonnington is the Director of the Monash e-Research Centre, Monash University, and a Professor in the School of Mathematical Sciences. He is a member of the Go8 Digital Futures group, and the steering committees for the Victorian Life Sciences Computing Initiative (VLSCI) and National Computational Infrastructure’s Specialist Facility for Imaging and Visualisation (MASSIVE). Paul is also a member of CSIRO’s e-Research Council. He recently served as the Chair of the Steering Committee for the Australian National Data Service Establishment Project. The Australian National Data Service is an initiative of the Australian Government begun under the National Collaborative Research Infrastructure Strategy. Since its development, it has executed plans to develop Australia’s research data infrastructure, capture descriptions of Australia’s research data, and build Australia’s research data management capability. The Monash e-Research Centre’s (MeRC) role is to build collaborations between research disciplines, nurture e-Research developments and build bridges between researchers and service providers. The Centre is an initiative of Monash University’s Deputy-Vice Chancellor (Research) to support researchers by harnessing the resources and capacities of the IT Services Division, the University Library and computer scientists in the Faculty of Information Technology to enhance research capability.