
BIODIVERSITY Archives


BIODIVERSITY@JISCMAIL.AC.UK


Subject:

ONLINE COURSE – Python for data science, machine learning, and scientific computing

From:

Oliver Hooker <[log in to unmask]>

Reply-To:

European Commissions Biodiversity Research Program <[log in to unmask]>

Date:

Thu, 23 Apr 2020 21:37:13 +0100

Content-Type:

text/plain


ONLINE COURSE - Python for data science, machine learning, and scientific computing (PDMS02). This course will be delivered live.

https://www.psstatistics.com/course/python-for-data-science-machine-learning-and-scientific-computing-pdms02/

This course will be delivered via video link from the 4th-8th May

This is a ‘LIVE COURSE’: the instructor will deliver lectures and coach attendees through the accompanying computer practicals via video link. A good internet connection is essential.

TIME ZONE – Western European Time – however all sessions will be recorded and made available allowing attendees from different time zones to follow a day behind with an additional 1/2 days support after the official course finish date (please email [log in to unmask] for full details or to discuss how we can accommodate you).



Course Overview:
Python is one of the most widely used and highly valued programming languages in the world, and is especially widely used in data science, machine learning, and other scientific computing applications. This course provides both a general introduction to programming with Python and a comprehensive introduction to using Python for data science, machine learning, and scientific computing. The major topics that we will cover include the following: the fundamentals of general purpose programming in Python; using Jupyter notebooks as a reproducible interactive Python programming environment; numerical computing using numpy; data processing and manipulation using pandas; data visualization using matplotlib, seaborn, ggplot, bokeh, altair, etc.; symbolic mathematics using sympy; data science and machine learning using scikit-learn, keras, and tensorflow; Bayesian modelling using PyMC3 and PyStan; high performance computing with Cython, Numba, IPyParallel, and Dask. Overall, this course aims to provide a solid introduction to Python generally as a programming language, and to its principal tools for doing data science, machine learning, and scientific computing. (Note that this course will focus on Python 3 exclusively, given that Python 2 has now reached its end of life.)

Monday 4th – Classes from 09:30 to 17:30
• Topic 1: The What and Why of Python. In order to provide some general background and context, we will describe where Python came from, what its major design principles and intended uses originally were, and where and how it is currently used. We will see that Python is now extremely widely used, especially in powering the web, in data science and machine learning, and in system level programming. Here, we also compare and contrast Python and R, given that both are extremely widely used in data science.

• Topic 2: Installing and setting up Python. There are many ways to write and execute code in Python. Which to use depends on personal preference and the type of programming that is being done. Here, we will explore some of the commonly used Integrated Development Environments (IDEs) for Python, which include Spyder and PyCharm. We will also briefly describe Jupyter notebooks, which are widely used for scientific applications of Python, and are an excellent tool for doing reproducible interactive work. We will cover Jupyter more extensively starting on Day 3. Also as part of this topic, we will describe how to use virtual environments and package installers such as pip and conda.

• Topic 3: Introduction to Python: Data Structures. We will begin our coverage of programming with Python by introducing its different data structures and the operations on them. This will begin with the elementary data types such as integers, floats, Booleans, and strings, and the common operations that can be applied to these data types. We will then proceed to the so-called collection data structures, which primarily include lists, dictionaries, tuples, and sets.

• Topic 4: Introduction to Python: Programming. Having introduced Python’s data types, we will now turn to how to program in Python. We will begin with iteration, such as the for and while loops. We will then cover conditionals and functions.
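
As a rough illustration of the Day 1 material (this is not taken from the course materials, and the species names, counts, and coordinates are made up), the following sketch uses the four collection types together with a function, a conditional, and for and while loops:

```python
# Illustrative only: the species names, counts, and coordinates are made up.
counts = {"robin": 3, "wren": 5, "heron": 1}   # dictionary: key -> value
sites = ["pond", "wood", "meadow"]              # list: ordered, mutable
location = (55.86, -4.25)                       # tuple: ordered, immutable
seen = set(counts)                              # set: the unique dictionary keys


def describe(n):
    """Return a label for a count using a conditional."""
    if n == 0:
        return "absent"
    elif n < 3:
        return "rare"
    return "common"


# A for loop over the dictionary's key/value pairs.
labels = {}
for species, n in counts.items():
    labels[species] = describe(n)

# A while loop: keep halving until the value drops below 1.
steps, value = 0, 5.0
while value >= 1:
    value /= 2
    steps += 1

print(labels)
print(steps)
```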

Tuesday 5th – Classes from 09:30 to 17:30
• Topic 5: Modules, packages, and imports. Python is extended by hundreds of thousands of additional packages. Here, we will cover how to install and import these packages, and also how to write our own modules and packages.

• Topic 6: Numerical programming with numpy. Although not part of Python’s official standard library, the numpy package is part of the de facto standard library for any scientific and numerical programming. Here we will introduce numpy, especially numpy arrays and their built-in functions (i.e. “methods”).

• Topic 7: Data processing with pandas. The pandas library provides the means to represent and manipulate data frames. Like numpy, pandas can be seen as part of the de facto standard library for data oriented uses of Python.

• Topic 8: Object Oriented Programming. Python is an object oriented language and object oriented programming in Python is extensively used in anything beyond the very simplest types of programs. Moreover, compared to other languages, object oriented programming in Python is relatively easy to learn. Here, we provide a comprehensive introduction to object oriented programming in Python.

• Topic 9: Other Python programming features. In this section, we will cover some important features of Python not yet covered. These include exception handling, list and dictionary comprehensions, itertools, advanced collection types including defaultdict, anonymous functions, decorators, etc.
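
To give a flavour of the features listed under Topic 9, here is a minimal sketch using only the standard library. It is an illustration rather than course material: the names log_calls and safe_ratio and the word list are hypothetical.

```python
from collections import defaultdict
from functools import wraps
from itertools import combinations


def log_calls(func):
    """Decorator (hypothetical example) that counts calls to func."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        wrapper.calls += 1
        return func(*args, **kwargs)
    wrapper.calls = 0
    return wrapper


@log_calls
def safe_ratio(a, b):
    """Exception handling: return None instead of raising on division by zero."""
    try:
        return a / b
    except ZeroDivisionError:
        return None


words = ["ant", "bee", "asp", "bat", "cod"]

# defaultdict: group words by first letter without checking for missing keys.
by_letter = defaultdict(list)
for w in words:
    by_letter[w[0]].append(w)

squares = [n * n for n in range(5)]            # list comprehension
lengths = {w: len(w) for w in words}           # dictionary comprehension
pairs = list(combinations("abc", 2))           # itertools
by_last = sorted(words, key=lambda w: w[-1])   # anonymous (lambda) function

ratios = [safe_ratio(1, 2), safe_ratio(1, 0)]
print(dict(by_letter), ratios, safe_ratio.calls)
```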

Wednesday 6th – Classes from 09:30 to 17:30
• Topic 10: Jupyter notebooks and Jupyterlab. Although we have already introduced Jupyter notebooks, here we will explore them properly. Jupyter notebooks are a reproducible and interactive computing environment that supports numerous programming languages, although Python remains the principal language used in Jupyter notebooks. Here, we’ll explore their major features and how they can be shared easily using GitHub and Binder.

• Topic 11: Data Visualization. Python provides many options for data visualization. The matplotlib library is a low level plotting library that allows for considerable control of the plot, albeit at the price of a considerable amount of low level code. Based on matplotlib, and providing a much higher level interface to the plot, is the seaborn library. This allows us to produce complex data visualizations with a minimal amount of code. Similar to seaborn is ggplot, which is a direct port of the widely used R based visualization library. In this section, we will also consider a set of other visualization libraries for Python. These include plotly, bokeh, and altair.

• Topic 12: Symbolic mathematics. Symbolic mathematics systems, also known as computer algebra systems, allow us to algebraically manipulate and solve symbolic mathematical expressions. In Python, the principal symbolic mathematics library is sympy. This allows us to simplify mathematical expressions, compute derivatives, integrals, and limits, solve equations, algebraically manipulate matrices, and more.

• Topic 13: Statistical data analysis. In this section, we will describe how to perform widely used statistical analyses in Python. Here we will start with the statsmodels package, which provides linear and generalized linear models as well as many other widely used statistical models. We will also introduce the scikit-learn package, which we will use more extensively on Day 4, and use it for regression and classification analysis.
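
The kinds of operations mentioned under Topic 12 can be sketched with sympy as follows. This is a small illustration chosen for this note, not an excerpt from the course, and it assumes sympy is installed; the expressions are arbitrary examples.

```python
import sympy as sp

x = sp.symbols("x")

# Simplify a trigonometric identity.
simplified = sp.simplify(sp.sin(x) ** 2 + sp.cos(x) ** 2)

# Derivative, integral, and limit of simple expressions.
derivative = sp.diff(x ** 3, x)
integral = sp.integrate(2 * x, x)
limit_value = sp.limit(sp.sin(x) / x, x, 0)

# Solve the equation x**2 - 4 = 0 for x.
roots = sp.solve(x ** 2 - 4, x)

# Algebraic manipulation of a matrix: determinant of a symbolic 2x2 matrix.
determinant = sp.Matrix([[x, 1], [1, x]]).det()

print(simplified, derivative, integral, limit_value, roots, determinant)
```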

Thursday 7th – Classes from 09:30 to 17:30
• Topic 14: Machine learning. Python is arguably the most widely used language for machine learning. In this section, we will explore some of the major Python machine learning tools that are part of the scikit-learn package. This section continues our coverage of this package that began in Topic 13 on Day 3. Here, we will cover machine learning tools such as support vector machines, decision trees, random forests, k-means clustering, dimensionality reduction, model evaluation, and cross-validation.

• Topic 15: Neural networks and deep learning. A popular subfield of machine learning involves the use of artificial neural networks and deep learning methods. In this section, we will explore neural networks and deep learning using the keras library, which is a high level interface to neural network and deep learning libraries such as Tensorflow, Theano, or the Microsoft Cognitive Toolkit (CNTK). Examples that we will consider here include image classification and other classification problems taken from, for example, the UCI Machine Learning Repository.
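
As a hedged sketch of what model evaluation with cross-validation (Topic 14) can look like in scikit-learn, the example below fits a random forest to the iris data set bundled with the library. The choice of data set, the number of trees, and the five folds are assumptions made for this illustration, not details from the course.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Iris is used here only because it ships with scikit-learn.
X, y = load_iris(return_X_y=True)

# A random forest with a fixed seed so that repeated runs agree.
model = RandomForestClassifier(n_estimators=100, random_state=0)

# 5-fold cross-validation: one accuracy score per held-out fold.
scores = cross_val_score(model, X, y, cv=5)

print(scores)
print(scores.mean())
```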

Friday 8th – Classes from 09:30 to 16:00
• Topic 16: Bayesian models. Two probabilistic programming languages for Bayesian modelling in Python are PyMC3 and PyStan. PyMC3 is a Python native probabilistic programming language, while PyStan is the Python interface to the Stan programming language, which is also very widely used in R. Both PyMC3 and PyStan are extremely powerful tools and can implement arbitrary probabilistic models. Here, we will not have time to explore either in depth, but will be able to work through a number of nontrivial examples, which will illustrate the general features and usage of both languages.

• Topic 17: High performance Python. The final topic that we will consider in this course is high performance computing with Python. While many of the tools that we considered above run extremely quickly because they interface with compiled code written in C/C++ or Fortran, Python itself is a high level, dynamically typed, and interpreted programming language. As such, native Python code does not execute as fast as code in compiled languages such as C/C++ or Fortran. However, it is possible to achieve compiled language speeds in Python by compiling Python code. Here, we will consider Cython and Numba, both of which allow us to achieve C/C++ speeds in Python with minimal extensions to our code. Also, in this section, we will consider parallelization in Python, in particular using IPyParallel and Dask, both of which allow easy parallel and distributed processing using Python.

Please email [log in to unmask] with any questions.

Other online courses

Python for data science, machine learning, and scientific computing (PDMS02)
4 May 2020 - 8 May 2020
https://www.prstatistics.com/course/python-for-data-science-machine-learning-and-scientific-computing-pdms02/

Introduction to spatial analysis of ecological data using R (ISPE03)
18 May 2020 - 21 May 2020
https://www.prstatistics.com/course/introduction-to-spatial-analysis-of-ecological-data-using-r-ispe03/

Introduction to R for ecologists and evolutionary biologists (IRFB03)
20 May 2020 - 21 May 2020
https://www.prstatistics.com/course/introduction-to-r-for-ecologists-and-evolutionary-biologists-irfb03/

Generalised Linear (MIXED) (GLMM), Nonlinear (NLGLM) And General Additive Models (MIXED) (GAMM) (GNAM01)
25 May 2020 - 28 May 2020
https://www.prstatistics.com/course/generalised-linear-mixed-glmm-nonlinear-nlglm-and-general-additive-models-mixed-gamm-gnam02/

Stable Isotope Mixing Models using SIBER, SIAR, MixSIAR (SIMM06)
8 June 2020 - 11 June 2020
https://www.prstatistics.com/course/stable-isotope-mixing-models-using-r-simm06/

Reproducible Data Science using RMarkdown, Git, R packages, Docker, Make & Drake, and other tools (RDRP01)
29 June 2020 - 3 July 2020
https://www.prstatistics.com/course/reproducible-data-science-using-rmarkdown-git-r-packages-docker-make-drake-and-other-tools-rdrp01/

Applied Bayesian modelling for ecologists and epidemiologists (ABME06) 
20 July 2020 - 24 July 2020 
https://www.prstatistics.com/course/applied-bayesian-modelling-for-ecologists-and-epidemiologists-abme06/

Species Distribution Modeling using R (SDMR02)
27 July 2020 - 30 July 2020
https://www.prstatistics.com/course/species-distribution-modeling-using-r-sdmr02/

