eResearch Africa 2019 workshops

Monday, 15 April 2019

Workshop 1: Artificial intelligence and machine learning for data science development

Time: 09:00 - 16:30

Venue: Snape Building, Upper Campus, University of Cape Town
(View map)

Cost:

Conference daily rate: R1000
Full conference rate (this covers all four days of eResearch Africa): R3500

Registration: Please complete the eResearch Africa 2019 online registration form.

Presented by: Innocent Mamvura (University of Witwatersrand)

Innocent Mamvura is a Data Scientist at University of Witwatersrand with at least 9 years of experience in Artificial Intelligence and Machine Learning, Deep Learning, data extraction, preparation, analysis, modelling, data warehousing, and visualisations using Data Science technologies such as Python, KNIME and many more. He has facilitated a number of data science workshops and training sessions over the past years at some of the big gatherings in higher education. He is also involved in the Wits Business Intelligence Data Science programme as a course instructor in AI and Machine Learning.

Prerequisites: Basic knowledge of linear algebra, probability theory and programming languages. The workshop is self-contained and does not require any previous knowledge of Artificial Intelligence and Machine Learning. This workshop can only accommodate 20 participants.

Requirements: Participants are required to bring a laptop, and install Python and KNIME packages for data science.

Overview

Artificial Intelligence and Machine Learning are transforming the higher education space through disruptive innovations such as adaptive learning, automated student support systems, virtual classroom reality, and many more. Steep increases in university costs are starting to deter potential applicants. As enrolments have grown faster than the number of good, high-paying jobs, more students are “underemployed”, suggesting we are overinvested in higher education. With increasing costs, more students are failing to complete their degrees because of financial exclusion or dropping out of the system. Institutions of higher learning are now using disruptive technologies to better understand these challenges, yet at the same time there is a shortage of these skills in higher education, one that is especially acute in South African higher education.

The goal of the workshop is to empower delegates with the concepts and applications of machine learning and artificial intelligence to support social challenges in Africa. The workshop will introduce delegates to data preparation, modelling, evaluation and deployment of machine learning models. Delegates will learn the difference between supervised and unsupervised learning techniques and do practical lab sessions using selected datasets.

Learning Outcomes

After successfully completing this workshop, participants should be able to:

Explain the concepts of machine learning and artificial intelligence
Understand the strengths and weaknesses of machine learning techniques
Design and implement various machine learning algorithms in a range of real-world applications

Outline

Data wrangling (120 minutes)
Data wrangling is the process of transforming or preparing data for analysis. It is the set of actions that allows you to move from raw data to refined data. This session will cover topics such as Data Integration, Data Transformation, Data Pivoting and Data Cleaning.
Unsupervised Learning (120 minutes)
The unsupervised learning session will cover machine learning algorithms that use information that is neither classified nor labelled, such as the k-means clustering algorithm.
Supervised Learning (240 minutes)
The supervised learning session will cover machine learning algorithms that use information that is classified or labelled, including topics such as Neural Networks, Decision Trees and Random Forests.
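
As a rough illustration of the distinction drawn in the outline above (this is not workshop material), the following Python sketch contrasts the two approaches on a tiny invented dataset. The data, the function names kmeans_1d, fit_stump and predict_stump, and the choice of a one-split decision tree (a "stump") as the supervised example are all assumptions made for illustration; the workshop itself may use different tools and datasets.

```python
import random

# Hypothetical data: weekly study hours for six students (invented for illustration).
hours = [1.0, 1.2, 0.8, 9.0, 9.5, 8.7]
# Hypothetical labels: 1 = passed, 0 = failed. Only the supervised model sees these.
passed = [0, 0, 0, 1, 1, 1]


def kmeans_1d(values, k, iterations=20, seed=0):
    """Unsupervised: group unlabelled numbers into k clusters (k-means)."""
    rng = random.Random(seed)
    centroids = rng.sample(values, k)  # start from k randomly chosen points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for value in values:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(value - centroids[i]))
            clusters[nearest].append(value)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return sorted(centroids)


def fit_stump(values, labels):
    """Supervised: learn the single threshold that best separates labelled data."""
    best_threshold, best_errors = None, None
    for threshold in sorted(set(values)):
        # Predict 1 for values above the threshold and 0 otherwise.
        predictions = [1 if value > threshold else 0 for value in values]
        errors = sum(p != y for p, y in zip(predictions, labels))
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict_stump(threshold, value):
    """Apply the learned threshold to a new, unseen value."""
    return 1 if value > threshold else 0


centres = kmeans_1d(hours, k=2)          # no labels needed
threshold = fit_stump(hours, passed)     # labels required
print("cluster centres:", centres)
print("learned threshold:", threshold)
print("prediction for 5 hours:", predict_stump(threshold, 5.0))
```

The clustering function discovers two groups without being told what they mean, whereas the stump needs the pass/fail labels to learn a rule it can apply to new cases.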

Target audience

Researchers
Data analysts
Statisticians
Information specialists
Citizen data scientists
Data scientists
Academic lecturers and many more

Monday, 15 April 2019 - Tuesday, 16 April 2019

Workshop 2: Software carpentry: teaching computing skills to researchers (Two-day workshop)

Time: 09:00 - 16:30

Venue: Snape Building, Upper Campus, University of Cape Town
(View map)

Cost:

Conference daily rate: R1000
Full conference rate (this covers all four days of eResearch Africa): R3500

Registration: Please complete the eResearch Africa 2019 online registration form.

Presented by: Adrianna Pińska (University of Cape Town); Martin Dreyer (North-West University)

Adrianna Pińska is a software developer working at UCT eResearch and IDIA. She has been involved with the Carpentries since 2014 and is an instructor.

Martin Dreyer is an IT systems analyst at NWU who specialises in eResearch and is extensively involved with the Carpentries as an instructor, trainer and workshop organiser.

Prerequisites: No previous experience with the tools is required. This workshop can only accommodate 40 participants.

Requirements: Participants are required to bring their own laptops, and if possible to install the software which will be used during the workshop beforehand (instructions for different operating systems will be provided). Installation assistance will also be available at the start of the workshop. Participants need not have any prior knowledge of the software tools or techniques taught in the workshop.

Overview

Software Carpentry is an international organisation which helps scientists and engineers to get more research done in less time and with less pain by teaching them basic lab skills for scientific computing. Its target audience is scientists, researchers, and research support personnel who have little to no prior computational experience, and its lessons are skills-specific, building on learners' existing knowledge to enable them to quickly apply skills learned to their own research after completing the workshop. The lesson material as well as software used during the workshops are available to anyone who would like to access it afterwards. Participants will be encouraged to help one another and to apply what they have learned to their own research problems.

In the past few years, local volunteers have run several successful Software Carpentry workshops here in South Africa, as well as in many other African countries, supported by an ever-growing local instructor base.

The workshop will take place over a two-day period. During this time, participants will be taught the basics of the Unix Shell, and will then build on this knowledge to manage their work under version control using Git. The next part of the workshop will teach participants how to program in Python, a high-level programming language with many scientific libraries which make it popular among scientists and academics.

The workshop will take place in a classroom setup with internet access for all participants.

Outline:

The Unix Shell: automating tasks and creating pipelines to process data on the command line
Plotting and Programming in Python: an introduction to programming and data analysis in Python, a popular high-level language
Version Control with Git: an introduction to backing up work and collaborating with other researchers using Git and GitHub

Target audience

The workshop is aimed at anyone who would like to learn more about automating data processing: graduate students, faculty and other researchers.

We will be teaching beginner-level material suitable for programming novices.

Monday, 15 April 2019

NeDICC Workshops

These workshops are co-located with the eResearch Africa 2019 Conference.

Time: 08:30 - 16:30

Venue: Hlanganani Junction, Level 5, Chancellor Oppenheimer Library, University of Cape Town

Cost: R750

Registration: Please complete the NeDICC workshop online registration form

Enquiries: Please email Hanlie Baudin

Presented by: Wynand van der Walt (Rhodes University), Lucia Lötter, Hanlie Baudin and Qinisile Dlamini (Human Sciences Research Council)

Overview

Metadata and Linked Data

This session will explore lessons learnt in establishing descriptive metadata standards for repositories. Furthermore, the session will provide information about linked data elements within the metadata creation process. Although the current digital content management system approach will be used as the basis for the discussions for this workshop, the lessons and outcomes are intended to inform decisions about communities of practice for research data management systems and applications as well.

Data visualisation

The “Data Revolution” sparked a renewed interest in data visualisation as a tool to represent the essence of research evidence. Data stewards should therefore raise awareness about the opportunities that this may entail, as well as provide guidance on best practices. Developing the skill to visualise data effectively also enables data stewards to communicate their own data-driven information in the best way. The workshop will kick off with an introductory presentation: "The story behind the numbers - An introduction to data visualisation". This will be followed by a work session where attendees will work with provided summarised data on the use of data sets and will be asked to create their own visualisations according to best practice principles.

Outline

08:00 - 08:30: Coffee/tea and registration
08:30 - 08:40: Welcoming
08:40 - 10:30: Workshop on Data Visualisation (1st session)
10:30 - 11:00: Coffee/tea and refreshments
11:00 - 12:30: Workshop on Data Visualisation (2nd session)
12:30 - 13:30: Lunch
13:30 - 15:00: Workshop on Metadata and Linked Data
15:00 - 15:30: Coffee/tea
15:30 - 16:30: Workshop on Metadata and Linked Data
16:30 - 16:40: Closing

Target audience

Researchers
Data analysts
Statisticians
Information specialists
Citizen data scientists
Data scientists
Academic lecturers and many more

Tuesday, 16 April 2019

Workshop 3: Research data systems – help or hindrance in data sharing?

Time: 09:00 - 12:30

Venue: Snape Building, Upper Campus, University of Cape Town
(View map)

Cost:

Conference daily rate: R1000
Full conference rate (this covers all four days of eResearch Africa): R3500

Registration: Please complete the eResearch Africa 2019 online registration form.

Presented by: Dale Peters, Mark Hahnel, Vicki Ricart

Requirements: N/A

Overview

Open Science is recognised as a cultural and behavioural phenomenon which requires a systemic shift in current research practice in South Africa. (1) A growing number of policies and mandates now require researchers to make their research outputs and supporting documentation (data management plans, articles, datasets, workflows, software) visible and reusable. However, the cumbersome manual workflow currently distributed across multiple research management systems in our universities and research organisations frustrates the adoption of Open Science: it imposes an additional administrative burden on the research community and hinders researchers’ focus on their core business, reproducible science. Countless third-party services have emerged to ease that workflow or, from the researchers’ perspective, to compound the confusion?

This workshop acknowledges the reluctance of researchers to engage with the current cumbersome workflow, and will explore, together with vendors and service providers, potential systems integrations that will positively influence the agility of the ecosystem over the next 3-5 years.

The fragmented nature of Open Science practice in South Africa will be a major consideration of the workshop. With no clear practice guidelines for researchers, weakened support for national mandates that lack incentive funding, and disparate systems and services across the scholarly infrastructure, the national research enterprise is at risk of failing to comply with the performance assessment requirements of our research organisations and international funding agencies. This growing gap has been filled by commercial database providers, who have gained a powerful role beyond their control of scientific publishing by offering research workflow tools that increasingly shape the scholarly infrastructure (2). They thereby define research excellence without being challenged by local expert authority on the distinctiveness or social impact of research activity in open engagement with society.

JISC reports that to ensure compliance and good practice across the institution, HEIs rely increasingly on integrated research information systems to facilitate effective management of their entire Open Science portfolio (3). Academic institutions face the dilemma of outsourcing their scholarly infrastructure to improve the efficiency of the scientific research process, at the risk of vendor lock-in. (4) This workshop will consider the technical and economic advantages of cloud-based infrastructure that scales across institutional boundaries, as in the national figshare model. It will examine alternative mechanisms for interoperability, considering the role of persistent identifiers (PIDs) in achieving loosely coupled discrete services for Open Science. Perhaps this integration can be achieved in a high-level interoperability framework, such as that offered by the Scholix Framework (5), for exchanging information about the links between publications and the underlying data? The Freya project holds further promise in harmonizing existing PID services and delivering new PID services to meet researcher needs. (6)

Researchers will certainly benefit from having less administrative burden in an integrated scholarly infrastructure, but the ultimate objective must be a strategic vision for research infrastructure developments in the next 3-5 years; one that is sufficiently shared to drive scientific progress in South Africa and on the continent; and that is complemented by support programmes that assist researchers in navigating the systemic shift ahead.

Recommended Reading

(1) SA-EU Open Science Dialogue Report. (2018, 29 October). https://drive.google.com/drive/folders/0B6w8fGuczhXqMFpLbHVEQ2JPV1U
(2) Posada, A., Chen, A. “Preliminary Findings: Rent-Seeking by Elsevier: Publishers are Increasingly in Control of Scholarly Infrastructure and Why We Should Care,” The Knowledge Gap: Geopolitics of Academic Production (blog), accessed 19 January 2019, http://bit.ly/2xKRnSr
(3) Burland, T. (2018, January 16). The problem with just depositing research data to existing repositories. https://researchdata.jiscinvolve.org/wp/2018/01/16/the-value-of-the-jisc-rdss-whats-the-problem-with-depositing-into-the-existing-repositories/
(4) Schonfeld, R. C. (2018, January 4). Big Deal: Should Universities Outsource More Core Research Infrastructure? https://doi.org/10.18665/sr.306032
(5) Burton, A., Koers, H., Manghi, P., et al. (2017). The Scholix Framework for Interoperability in Data-Literature Information Exchange. D-Lib Magazine, 23(1/2). http://www.dlib.org/dlib/january17/burton/01burton.html
(6) Freya project. https://www.project-freya.eu/en/about/mission

Workshop objectives

Identify the major players in the changing research ecosystem, and what they contribute towards the integration of siloed systems that are impeding the adoption of Open Science.
Learn from success stories in South Africa and on the African continent: what has been the uptake of Open Science systems and services; how do they support alternative measures of research excellence for recognition and reward; and what have been the pitfalls?
Enable institutional decision makers (e.g. Research and Library Directors) to best position themselves in the next 3-5 years in developing effective and efficient research support services.

Outline

Introduction of key stakeholders, including software vendors, service providers, publishers and funders (45 minutes)
Development of the research infrastructure in the next 3-5 years (60 minutes)
Open Science case studies (45 minutes)
The way forward – a roadmap for research support services (45 minutes)

Target audience

Researchers
Research support staff from Libraries, IT and Research Offices
Libraries, IT and Research Office Directors
Funding agencies
Research system vendors and service providers
Publishers

Tuesday, 16 April 2019

Workshop 4: Large data transfer: can we help you move your large data sets?

Time: 09:00 - 12:30

Venue: Snape Building, Upper Campus, University of Cape Town
(View map)

Cost:

Conference daily rate: R1000
Full conference rate (this covers all four days of eResearch Africa): R3500

Registration: Please complete the eResearch Africa 2019 online registration form.

Presented by: Kasandra Pillay, Sakhi Hadebe, Johann Hugo, and Renier van Heerden (SANReN)

Kasandra Pillay is currently the Group Leader (acting) of the Services Development and Incubation (SDI) team at SANReN. She is co-ordinating the Performance Enhancement and Response Team (PERT) initiative at SANReN, which is currently focusing on the SANReN Data Transfer Pilot Service, helping South African researchers to move their datasets between sites faster. Her qualifications include MEng and BEng Hons (Technology Management), University of Pretoria, and BSc Engineering (Electronics), University of KwaZulu-Natal. She is also registered as a Professional Engineer (Pr. Eng.) with the Engineering Council of South Africa. Her interest area is Services Management in National Research and Education Networks.

Sakhi Hadebe has been an Engineer at the South African National Research Network (SANReN) Competency Area at the Council for Scientific and Industrial Research (CSIR) since 1 January 2013. He holds Bachelor's and Honours degrees in Computer Science, both obtained from the University of Zululand. He started his career as an educator, teaching Mathematics and Computer Applications Technology. He worked for FNB as a Call Centre Specialist and then moved to India, where he completed a one-year Infrastructure Management Services (IMS) training programme. While in India, he was also certified as a Microsoft Certified Systems Engineer (MCSE 2003) and certified on ITIL v2. He also worked for SAAB Grintek Technologies as a support engineer.

Johann Hugo is a networking specialist. He has experience with all kinds of networking services, routing, firewalls, VoIP, IPv6 networking, FreeBSD Unix systems, RADIUS servers, outdoor wireless mesh networks, telecommunication networks, embedded hardware, etc. He was involved with the roll-out of the first SANReN links and since 2010 has been responsible for maintaining the national eduroam servers. He has helped several South African universities to connect to eduroam and regularly assists them with debugging.

Dr Renier van Heerden is a principal researcher at the Council for Scientific and Industrial Research (CSIR) in Pretoria, South Africa in the field of Cyber Security and currently Science Engagement (acting) at the SANReN CA. His key interests are password security, network attacks and network ontologies. Prior to joining the CSIR, he worked as a software engineer in advanced optics applications for South African-based Denel Optronics and as a Lecturer at the University of Pretoria. Renier obtained a degree in Electronic Engineering as well as a Masters in Computer Engineering from the University of Pretoria. He has a PhD from Rhodes University.

Prerequisites: N/A

Requirements: Attendees who need assistance with Data Transfer for a particular use case are required to complete the following survey: https://docs.google.com/forms/d/e/1FAIpQLSdQUqMYw1YcFP-wI-mYW_b-qyzrzi6s-ZWsWMeBkqedFITmgQ/viewform.

For more information about the Data Transfer pilot, visit https://www.sanren.ac.za/services/pert/data-transfer-pilot/.

Overview

The SANReN CA are looking for researchers/scientists/IT facilities of Research Institutes who need to transfer big data sets from/to one or more external sources to/from their institutions. These include both international and national sites. We want to help these users increase the speed and reliability of transferring their large datasets across their networks by helping their institution.

Moving masses of data is a challenge. In significant part, that’s because it can be slow and frustrating to transfer vast quantities of data from the many places it can be stored or generated over general-purpose computer networks.

When scientists attempt to run data-intensive applications over campus networks, it often results in slow transfers, in many cases poor enough that the science mission is significantly impacted. In the worst case, this means not getting the data, getting it too late, or resorting to alternative, inefficient measures such as shipping disks around.

The aim of this half-day workshop is to help researchers to transfer large data sets by using the SANReN Data Transfer Pilot service or by setting up their own hardware and data transfer tools.
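
To put the problem described above into numbers, here is a back-of-the-envelope Python sketch of ideal transfer times. The throughput figures are illustrative assumptions, not measurements of any campus network or of the SANReN service, and the calculation ignores protocol overhead, packet loss and disk speed. The function name transfer_time_seconds is hypothetical.

```python
def transfer_time_seconds(size_gigabytes, throughput_megabits_per_second):
    """Ideal transfer time, ignoring protocol overhead, packet loss and disk speed."""
    size_megabits = size_gigabytes * 1000 * 8  # 1 GB = 1000 MB; 1 byte = 8 bits
    return size_megabits / throughput_megabits_per_second


# Assumed sustained throughputs in Mbit/s (illustrative values only).
for rate in (10, 100, 1000):
    minutes = transfer_time_seconds(10, rate) / 60
    print(f"10 GB at {rate} Mbit/s: about {minutes:.1f} minutes")
```

Under these assumptions, a tenfold drop in sustained throughput means a tenfold increase in transfer time, so the achieved end-to-end rate matters far more than the nominal link speed.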

Outline

First topic: Networking challenges and Data Transfer Architecture: The Science DMZ Model (60 minutes)
Second topic: SANReN Data Transfer Pilot service Proof of Concept: Progress, Results, Next steps (45 minutes)
Third topic: Large Data Transfer tools: Globus, SCP, FileSender (30 minutes)
Fourth topic: perfSONAR and other bandwidth measuring tools (30 minutes)
Questions and answers at an individual level (40 minutes)

Target audience

We are looking for users who want to move large data sets. Anyone with more than 10 GB of research/science data, particularly those who currently have issues moving it around (e.g. those who consider shipping hard drives of data instead of using the SA NREN), is a good candidate.

Tuesday, 16 April 2019

Workshop 5: Figshare Fest Africa

Time: 13:30 - 16:30

Venue: Snape Building, Upper Campus, University of Cape Town
(View map)

Cost:

Conference daily rate: R1000
Full conference rate (this covers all four days of eResearch Africa): R3500

Registration: Please complete the eResearch Africa 2019 online registration form.

Presented by: Kayleigh Lino (Figshare)

Kayleigh Lino is a Technical Account Manager at Figshare, where she manages the implementation and general health of the software at institutions across the globe. She previously worked as a Digital Curation Officer at UCT Libraries, where she was involved in the development of Research Data Services at UCT. Having spent most of her professional career working with digital collections, Kayleigh is passionate about curation in the context of digital repositories, and is currently engaged in a Master’s dissertation that focuses on the role of digital curation in academic libraries.

Prerequisites: N/A

Requirements: If you are an institutional Figshare client, bring along any thoughts on your experience with Figshare and any conversation starters you’d like to have with us and other institutional clients.

Overview

Sharing knowledge and best practice across communities, regardless of location and experience in the field of open research, is hugely important to us at Figshare.

Over the past 4 years, Figshare has hosted several Figshare Fests in different countries across the globe. These Fests are aimed at gathering local Figshare communities to collaborate and converse over mutual concerns, as well as introduce potential clients and users to the Figshare community, giving them a better idea of who we are, what we do, and how the Figshare community is working together to improve the global state of open research.

This workshop will consist of a mix of talks and collaborative exercises to provide a better understanding of how Figshare has evolved from a data repository to a fully-fledged open research repository, as well as a specific focus on an ongoing project to improve the curation and review of non-traditional research outputs on Figshare.

Figshare believes that data review is a positive step in scientific scholarly communication. As such, we'd like to use this opportunity to identify the current challenges faced by the Figshare community in Africa in this regard. Feedback received on this topic during the workshop will contribute to the global project outputs published openly on figshare.com, in order to reach a wider audience and make the process more community driven and as collaborative as possible, with an ultimate goal of shaping future software development at Figshare.

Proposed topics include:

Data curation & FAIR data repositories: shared approaches to improving trustworthiness & review of data
Building Figshare communities: support, shared resources & researcher engagement strategies in pursuit of best RDM practice
Figshare as an Institutional open research repository for papers, theses and/or educational and teaching materials (OERs)

Outline

Figshare update: New released features, roadmap plans (30 minutes)
Figshare as an IR: a focus on adaptation, migration, and integration (60 minutes)
Open Data Curation: towards a shared framework for the review of non-traditional research outputs (60 minutes)
Institutional use cases: client presentations on engagement and other user experiences (60 minutes)

Target audience

This event is open to anyone affiliated with an academic/research institution; you do not have to be a Figshare customer. Attendance by commercial companies will need to be considered by the organisers.

Tuesday, 16 April 2019

Workshop 6: Science communication workshop

Time: 13:30 - 16:30

Venue: Snape Building, Upper Campus, University of Cape Town
(View map)

Cost:

Conference daily rate: R1000
Full conference rate (this covers all four days of eResearch Africa): R3500

Registration: Please complete the eResearch Africa 2019 online registration form.

Presented by: Natalie Simon (UCT)

This workshop will be hosted primarily by Natalie Simon, from the Global Strategy and Visibility team in the UCT Research Office. Natalie has been working as a writer and science communicator in the UCT Research Office since November 2014; prior to that she worked as a freelance writer and journalist. She has experience in radio, print and digital journalism.

Prerequisites: N/A

Requirements: Come prepared with an example of a research publication, dataset or topic that you would like to tell a broader audience about.

Overview

Open publication, both of data and the published article, is of little consequence if no one is reading or reusing your work. Researchers who make their work public on blogs, social media and other forms of media receive more citations (Lamb, C.T., Gilbert, S.L., Ford, A.T., 2018).

Effective science communication is a skill that can be taught. This workshop will comprise three components to equip researchers with the basic skills of science communication in a digital age:

Outline

Communicating in plain English: Often researchers are so immersed in their own fields, they no longer recognise the jargon in their writing. We will help you identify and lose the jargon and write in a way that is accessible to a wider audience, including funders and policymakers. (30 minutes)
Finding the news hook: If you don’t catch your audience’s attention in the first sentence you will lose them forever. This workshop will teach researchers how to identify what makes their work newsworthy and appealing to an audience outside of their field. (40 minutes)
Using social media effectively: Social media has its own rules and etiquette; understanding these is the first step towards building a strong following and increasing your reach. This workshop will include a crash course in how social media works, how the platforms differ and what practical strategies you can use to ensure the greatest possible reach. (We will discuss Twitter, Facebook and Instagram primarily, but we are happy to have an open discussion about other platforms if there is time and interest.) (80 minutes)

Target audience

Researchers interested in building their science communication skills

Reference

[1] Lamb CT, Gilbert SL, Ford AT. 2018. Tweet success? Scientific communication correlates with increased citations in Ecology and Conservation. PeerJ 6:e4564 https://doi.org/10.7717/peerj.4564
