This workshop at the National Center for Ecological Analysis and Synthesis (NCEAS) will introduce researchers to advanced topics in computationally reproducible research in Python, including software and techniques for working with very large datasets. The material covers working in cloud computing environments, Docker containers, and parallel processing with tools like Parsl and Dask.
The workshop will also cover concrete methods for documenting and uploading data to the Arctic Data Center, advanced approaches to tracking data provenance, responsible research and data management practices including data sovereignty and the CARE principles, and ethical concerns with data-intensive modeling and analysis.
Topics will include:
- Scalable computing,
- Cloud computing concepts,
- Docker environments,
- Remote computing,
- Parallel processing and concurrency,
- Large data transfer and data staging,
- Data extraction, and
- I/O efficiency.
This course is intended for researchers who need to work more efficiently with big datasets or run computing-intensive processes.
Support for travel and/or lodging may be available.
Application deadline: 22 December 2023
For more information, go to: Scalable and Computationally Reproducible Approaches to Arctic Research Application 2024 (google.com)
For questions, contact:
Angie Garcia
Email: agarcia@nceas.ucsb.edu