From the feedback I have received over the last two weeks, and from my own impressions, I am forced to conclude that this year again the workshop was an exceptional occasion to acquire and share information on the computational tools available to neuroscientists.

I will summarize it in one sentence, so catch your breath: in the morning we had a talk on normal distribution statistics and a second on support vector machines; the early afternoon block followed with a talk on circular statistics and one on non-linear system identification; finally, the day ended with small-scale neural population modeling and functional clustering, and an excellent, comprehensive description of a non-parametric approach to non-linear system modeling using Volterra series. How does that sound to you?

One thing is for sure: each of these talks covered a pertinent and timely topic, ranging from beginner to expert level. But that is not all!

Like last year, we are committed to giving the workshop a lasting dimension. YouTube provides a fantastic platform for reaching a wider audience, so each talk has been edited and posted online. Through this, anyone in the world is free to benefit from the excellent work of our speakers. In addition, speakers were invited to publish code samples related to their talks, so that anyone can quickly get started with their computational tool of choice. The listing below will guide you through these online supplements to the workshop experience.

Before leaving it up to you to explore these online documents, let me take the time to thank everyone who has been involved in the workshop. First of all, the speakers, without whom there wouldn’t be a workshop: Kelly Bullock, Sébastien Tremblay, Nour Malek, Adam Schneider, Richard Greg Stacey, Lennart Hilbert and Theodore Zanos. Second, but no less importantly, the people who helped me during the workshop by taking notes, taking pictures and moving tables around, something I couldn’t have done alone; a big thank you to Nathan Friedman, Matthew Krause and Nour Malek. And finally, the students and others who attended the workshop: we did it for you, and your presence encourages us to continue. Thank you for taking the time to join us in this project.

See you next year,

Frédéric Simard

Listing of the talks:

- Normal Distribution Statistics – Kelly Bullock

Contact: kelly.bullock@mail.mcgill.ca

This video presents the basics of normal distribution statistics: the standard normal distribution, tests for normality, skewness and kurtosis, and unpaired and paired tests on the mean.
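As a quick illustration of two of these notions (my own toy sketch, not code from the talk), skewness and excess kurtosis can be computed directly from a sample; both should be near zero for normally distributed data:

```python
import random
import statistics as stats

# Toy sketch: sample moments used to check how "normal" a sample looks.
# Skewness and excess kurtosis are both ~0 for a normal distribution.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(10000)]

mean = stats.fmean(sample)
sd = stats.pstdev(sample)

# Standardized third and fourth central moments.
skewness = sum((x - mean) ** 3 for x in sample) / (len(sample) * sd ** 3)
excess_kurtosis = sum((x - mean) ** 4 for x in sample) / (len(sample) * sd ** 4) - 3.0

print(round(skewness, 2), round(excess_kurtosis, 2))
```

Large absolute values of either quantity are a warning that the normal-distribution tests covered in the talk may not apply to your data.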

- Support Vector Machines – Sébastien Tremblay

Contact: sebastien.du.tremblay@mail.mcgill.ca

This video presents support vector machines in the context of computational neuroscience, where they are used for population code analysis. It contains a comprehensive description of the algorithm and of how to get started.

You can find a self-contained code sample here:

SVM_toolbox.zip
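If you just want a feel for the idea before opening the toolbox, here is a toy sketch (mine, not Sébastien’s code) of a linear SVM trained by stochastic sub-gradient descent on the hinge loss, Pegasos-style, on two made-up “population response” clusters:

```python
import random

# Minimal linear SVM: stochastic sub-gradient descent on the
# regularized hinge loss (Pegasos-style). Toy illustration only.
random.seed(1)

# Two linearly separable 2-D clusters, labels -1 / +1.
data = [((random.gauss(-2, 0.5), random.gauss(-2, 0.5)), -1) for _ in range(50)] + \
       [((random.gauss(2, 0.5), random.gauss(2, 0.5)), 1) for _ in range(50)]

w = [0.0, 0.0]
b = 0.0
lam = 0.01                           # regularization strength
for t in range(1, 2001):
    (x1, x2), y = random.choice(data)
    eta = 1.0 / (lam * t)            # decaying learning rate
    margin = y * (w[0] * x1 + w[1] * x2 + b)
    # Sub-gradient step: shrink w, then push on margin violations.
    w = [wi * (1 - eta * lam) for wi in w]
    if margin < 1:
        w[0] += eta * y * x1
        w[1] += eta * y * x2
        b += eta * y

accuracy = sum(1 for (x1, x2), y in data
               if y * (w[0] * x1 + w[1] * x2 + b) > 0) / len(data)
print(accuracy)
```

Real SVM libraries solve the same optimization problem much more carefully (and add kernels); this only shows the shape of the idea.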

- Circular Statistics – Nour Malek and Frédéric Simard

Contact: nour.malek@mail.mcgill.ca, frederic.simard@atomsproducts.com

This video presents the basics of circular statistics and is framed around the toolbox of Philipp Berens. It contains a description of the von Mises probability density function, and parametric and non-parametric tests for uniformity and on the mean, the median and others, all in the context of computational neuroscience.

You can find a self-contained code sample here:

CircStatsExamples.zip
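As a taste of what the toolbox computes, here is a toy sketch (not the toolbox code) of the mean resultant length, the circular mean direction and the Rayleigh test of uniformity for a sample of angles:

```python
import math
import random

# Toy sketch of basic circular statistics for angles in radians.
random.seed(2)
# Angles concentrated around pi/2, mimicking e.g. preferred spike phases.
angles = [random.gauss(math.pi / 2, 0.4) for _ in range(200)]

n = len(angles)
C = sum(math.cos(a) for a in angles) / n
S = sum(math.sin(a) for a in angles) / n
R = math.hypot(C, S)                 # mean resultant length, in [0, 1]
mean_dir = math.atan2(S, C)          # circular mean direction

# Rayleigh test of uniformity: large z => reject "no preferred direction".
z = n * R ** 2
p_approx = math.exp(-z)              # first-order approximation of the p-value

print(round(mean_dir, 2), round(R, 2), p_approx < 0.05)
```

The toolbox implements the exact versions of these tests (and many more); the approximation above is only meant to show the logic.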

- Non-linear system identification – Adam Schneider

Contact: adam.schneider@mail.mcgill.ca

This video discusses the properties and identification of linear and non-linear systems. It covers non-linear system cascade modeling using reverse correlation and the spike-triggered average.
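The spike-triggered average itself is simple enough to sketch in a few lines. This toy example (mine, not the speaker’s code) assumes a made-up linear filter followed by a threshold, and recovers the filter’s shape by averaging the stimulus windows that precede spikes:

```python
import random

# Toy spike-triggered average (STA): average the stimulus window
# preceding each spike to recover the linear front-end of a cascade model.
random.seed(3)

kernel = [0.0, 0.2, 0.5, 1.0, 0.5]      # assumed "true" linear filter
stim = [random.gauss(0, 1) for _ in range(5000)]

# Spikes: threshold the filtered stimulus (a crude LN cascade).
spikes = []
for t in range(len(kernel), len(stim)):
    drive = sum(k * s for k, s in zip(kernel, stim[t - len(kernel):t]))
    if drive > 2.0:
        spikes.append(t)

# STA = mean stimulus segment preceding each spike.
L = len(kernel)
sta = [0.0] * L
for t in spikes:
    for i in range(L):
        sta[i] += stim[t - L + i]
sta = [x / len(spikes) for x in sta]

print([round(x, 2) for x in sta])  # proportional to `kernel`, up to noise
```

For a Gaussian white-noise stimulus, the STA is proportional to the underlying linear filter, which is exactly why reverse correlation works.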

- Neuronal Modeling and Synchronicity – Greg Stacey

Contact: Richard.greg.stacey@gmail.com

A video about small-scale neural network modeling. It focuses on the identification of synchronicity in the neural population.

- Functional clustering – Lennart Hilbert

Contact: lennart.hilbert@gmail.com

This talk follows up on Greg Stacey’s and covers the interesting topic of functional clustering of synchronous neurons, through hierarchical clustering.
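Hierarchical (agglomerative) clustering can be sketched very compactly. In this toy example (not Lennart’s code), the “distance” between two neurons would in practice be something like one minus their spike-train correlation; here plain numbers stand in to keep things short:

```python
# Toy agglomerative (hierarchical) clustering with single linkage.
points = [0.1, 0.15, 0.2, 5.0, 5.1, 9.8, 10.0]
clusters = [[p] for p in points]

def linkage(a, b):
    # Single linkage: distance between the closest members of two clusters.
    return min(abs(x - y) for x in a for y in b)

# Repeatedly merge the two closest clusters until only 3 remain.
while len(clusters) > 3:
    i, j = min(((i, j) for i in range(len(clusters))
                for j in range(i + 1, len(clusters))),
               key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]))
    clusters[i] += clusters.pop(j)

print(sorted(sorted(c) for c in clusters))
# -> [[0.1, 0.15, 0.2], [5.0, 5.1], [9.8, 10.0]]
```

Stopping at a fixed number of clusters is one choice; in practice you would cut the dendrogram at a distance threshold chosen from the data.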

- Non-parametric Non-linear modeling with Volterra series – Dr. Theodoros Zanos

Contact: theozanos@gmail.com

A description of Volterra series and how they can be used to model non-linear systems. It is a powerful, non-parametric approach that relies, in this version, on Laguerre expansion functions. An advanced topic made accessible by Theodore Zanos.

You can find a self-contained code sample here:

VFC distrib.zip

Good day everyone!

Over the course of the last few months, a team of McGill neuroscience students and post-docs has been preparing talks on computational neuroscience, aimed at explaining the computational techniques they apply to their data. These talks will be presented at the Computational Neuroscience Workshop 2014.

Through the workshop, you will learn about the practical aspects of analysis tools and modeling techniques commonly used in neuroscience, explained in layman’s terms and centered on practical considerations. The goal is to provide a refined description that cuts to the essentials and keeps the explanations quick and simple. It is an occasion to rapidly survey several approaches before deciding whether to incorporate them into your own study, should they have the potential to improve it.

This year, the event will take place on Tuesday, April 29th, on the 5th floor of the McIntyre Medical Building (rm. 521 and 504). It will start at 10h00 with morning cookies and wake-up coffee. The first two talks will take place in the morning (10h30-12h00) before the lunch break (12h00-1h00), which will feature social pizza. Students who have registered for lunch will have priority over the pizza; the lunch is free of charge, see below for registration details. In the afternoon, 4 more talks will be presented (1h00-2h30 and 3h00-4h30), separated by a coffee break at 2h30 (see detailed schedule below).

This event is organized by us and is for us! Come one, come all!

**Lunch registration**

For the lunch registration, you need to add your name to the Doodle list at the link below.

Link to Doodle survey

Adding your name registers you for a meal, while the answer you provide determines whether you prefer a vegetarian meal or not: answer Yes for the vegetarian option and No for the non-vegetarian option. The results of this poll will be used to plan the correct amount of food.

**Detailed Schedule**

10h00 – 10h30

Welcoming cookie and coffee break

10h30-11h15

*Normal Distribution Statistics*

**Kelly Bullock**

11h15-12h00

*Support Vector Machines*

**Sébastien Tremblay**

https://www.dropbox.com/s/4vzkchzg96bbor6/SVM_toolbox.zip

12h00-1h00

Lunch Break

1h00-1h45

*Circular Statistics*

**Nour Malek**

1h45-2h30

*Non-Linear System Identification*

**Ashkan Golzar and Adam Schneider**

2h30-3h00

Coffee Break

3h00-3h45

*Network Modeling and Functional Clustering*

**Richard Greg Stacey and Lennart Hilbert**

3h45-4h30

*Volterra-Based Non-Linear Modeling*

**Theodore Zanos**

4h30-(…)

Social

Finally! After several hours of video editing and cursing (ok, it wasn’t that bad), here are the videos of the talks presented during the computational neuroscience workshop, held on May 7th of this year.

Let me remind you that this workshop was organized by us, students, and for us, students. It is meant to be a means of pooling our knowledge of methods for the analysis of neuronal signals. Among the topics covered we find: linear regression, optimization, classification, decoding, dimensionality reduction and information theory. Since the workshop, I have heard from many how much it opened their minds to existing approaches in computational neuroscience and/or clarified some concepts. I myself have made great use of the talk on information theory, adding this aspect to my data analysis.

By putting the talks online, we can now all profit from the workshop presentations. If you are curious about or need help with any of these topics, you can start your research here. Beyond watching the talks, feel free to contact the speakers directly to get personalized advice on your specific problem. By sharing knowledge among ourselves, the whole community benefits.

A very big thank you to all those who were involved in the organization of the event: Lennart Hilbert, Vladimir Grouza, Ananda Tay; and to the speakers and their collaborators: Greg Stacey, Nathan Friedman, Adam Schneider, Ashkan Golzar, Mohsen Jamali and Diego Mendoza. Without you, nothing would have been possible. And, not to forget, CAMBAM, for providing the funds to offer our participants a free lunch.

Below you will find the listing of the talks, accompanied by a description of their contents.

Frederic Simard,

On behalf of the CAMBAM student community.

Listing of the talks:

- Linear Regression, Optimization, Classification and Decoding – Part 1, by Greg Stacey

Speaker: Greg Stacey

Contact: richard.greg.stacey@mail.mcgill.ca

This video is the first part of a series of two talks on regression, optimization, classification and decoding. In it you will learn the basics of linear regression, with an explanation of the concept of a cost/loss function and of the gradient method for optimizing such a function.

You can find a code sample here:

RegressionExamples.m
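The gradient method can also be sketched outside MATLAB. This is my own minimal Python version (not the RegressionExamples.m sample), fitting a straight line by descending the gradient of the mean squared error:

```python
import random

# Toy gradient descent: fit t = w0 + w1*x by minimizing the MSE.
random.seed(4)
xs = [i / 10 for i in range(100)]
ts = [2.0 + 3.0 * x + random.gauss(0, 0.1) for x in xs]   # "true" w = (2, 3)

w0, w1 = 0.0, 0.0
eta = 0.01                     # learning rate
for _ in range(5000):
    # Gradient of the MSE with respect to each parameter
    # (the constant factor 2 is absorbed into the learning rate).
    g0 = sum((w0 + w1 * x - t) for x, t in zip(xs, ts)) / len(xs)
    g1 = sum((w0 + w1 * x - t) * x for x, t in zip(xs, ts)) / len(xs)
    w0 -= eta * g0
    w1 -= eta * g1

print(round(w0, 1), round(w1, 1))  # close to the true (2.0, 3.0)
```

For this simple cost function a closed-form solution exists, but the descent loop above is the same recipe that generalizes to models without one.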

- Linear Regression, Optimization, Classification and Decoding – Part 2, by Nathan Friedman

Speaker: Nathan Friedman

Contact: nathan.friedman2@mail.mcgill.ca

In this talk, the second in the series on linear regression, optimization, classification and decoding, I give a very brief overview of some machine learning classification algorithms. I explain what a linear classifier is and demonstrate both binary and multi-class classifiers. The algorithms presented are: Regularized Least Squares, Logistic Regression, Perceptron, Support Vector Machine, and Fisher’s Discriminant.

- Dimensionality Reduction: PCA and Gauss. Proc. Factor Analysis, by Frederic Simard

Speaker: Frederic Simard

Contact: frederic.simard@mail.mcgill.ca

Website: www.atomsproducts.com

In this talk, you will learn the basics of dimensionality reduction. The first algorithm presented is principal component analysis (PCA), which is based on the variance in the data set. You will learn how to select a subset of dimensions while retaining most of the information in your data, in order, for example, to build a classifier. A quick presentation of Gaussian Process Factor Analysis follows. This algorithm extracts trajectories of the system state in a lower-dimensional space.

You can find code sample packages here:

Principal Components Analysis Code Sample

Gaussian Process Factor Analysis Code Sample

- Information Theory and Neural Coding – Part 1, by Adam Schneider

Speaker: Adam Schneider

Contact: adam.schneider@mail.mcgill.ca

Collaborator: Mohsen Jamali

Information theory, developed by Claude Shannon in 1949, provides mathematically rigorous tools to quantify the precision with which a system’s output contains information about its inputs, setting physical limits on a system’s capacity for information transmission. In this talk I present a brief summary of the fundamental concepts underlying information theory, in the context of its application to neuronal signal processing.

A useful, well documented, MATLAB toolbox for calculating coherence and mutual information in neural systems can be found at www.chronux.org.
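As a toy illustration of the core quantity (my sketch, not from the talk or Chronux), the mutual information between a binary stimulus and a binary response can be computed directly from their joint probabilities:

```python
import math

# Mutual information between a binary stimulus S and binary response R:
# I(S;R) = sum over (s, r) of p(s,r) * log2( p(s,r) / (p(s) * p(r)) ).

# Assumed joint distribution: the response usually follows the stimulus.
p_joint = {(0, 0): 0.4, (0, 1): 0.1,
           (1, 0): 0.1, (1, 1): 0.4}

# Marginal distributions of S and R.
p_s = {s: sum(p for (si, _), p in p_joint.items() if si == s) for s in (0, 1)}
p_r = {r: sum(p for (_, ri), p in p_joint.items() if ri == r) for r in (0, 1)}

mi = sum(p * math.log2(p / (p_s[s] * p_r[r]))
         for (s, r), p in p_joint.items())

print(round(mi, 3))  # -> 0.278 bits per symbol
```

With a perfectly reliable response the same computation gives 1 bit, and with an independent response it gives 0: the value quantifies how much observing R reduces uncertainty about S.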

- Information Theory and Neural Coding – Part 2, by Ashkan Golzar

Speaker: Ashkan Golzar

Contact: ashkan.golzar@mail.mcgill.ca

Collaborator: Mohsen Jamali

Following the fundamental concepts of information theory in the previous part, in the second part of this talk we present three methods to calculate mutual information between stimulus and neural signal: direct method, upper-bound method, and lower-bound method. We discuss advantages and disadvantages of each method as well as the assumptions inherent to each. We also show how information theory can address central questions in the field of neural coding.

*Leon Glass, Isadore Rosenfeld Chair in Cardiology and Professor of Physiology at McGill University and CAMBAM member, has recently been awarded the Arthur Winfree Prize by the Society for Mathematical Biology: “the Arthur T. Winfree Prize […] will honor a theoretician whose research has inspired significant new biology.” [smb.org] On this occasion, we (Thomas Quail and Lennart Hilbert) posed a few questions to Leon. A reminder that Leon is not only a brilliant theorist, but is hardly ever found short of experiences, insights, and worthwhile pastimes to talk about.*

*You were friends with Arthur Winfree for many years. How did Winfree’s ideas influence your scientific trajectory?*

Art Winfree had an incredible geometric intuition for biological dynamics. One of his early papers described phase resetting in a simple model of a nonlinear oscillator in which the limit cycle had a circular path. Although others, including Poincaré, had looked at similar models of oscillations, Art predicted that biological oscillations described by nonlinear equations should display topological differences in their phase resetting curves depending on the amplitude of the stimulus. This was a fascinating approach — going from a simple mathematical idea to generic predictions about experimental findings. The specific model was also a stimulus for thinking about the effects of periodic stimulation of biological oscillators and was one of the important factors that led to the experimental and theoretical work with Michael Guevara and Alvin Shrier on the entrainment of cardiac oscillations. A geometric approach to studying nonlinear dynamics has always seemed the natural way to proceed – Art’s pioneering work has been crucial to my thinking.

*How do you choose your topics/problems to work on?*

I like to choose problems that seem interesting to me and where there seems to be something basic that I do not understand. There should be the possibility of some mathematical analysis using tools that I understand or feel that I could understand with a bit of work. I also strongly favor problems where there is some local expertise in the biological aspects that would facilitate collaborative work involving both theory and experiments. Since lots of work now is done in a collaborative fashion with students, finding problems that are suitable for a particular student also plays a big role. I once heard Richard Feynman say that he chose problems to work on by optimizing the product: (importance of the problem) X (ability to solve the problem). That might be a bit too calculating for me, but it sounds like good advice to pass on.

Although I have sometimes, particularly when I was younger, not published work that would have been worth publishing, I rarely cut off projects. I have worked and continue to work in diverse areas – cardiac arrhythmias, genetic networks, visual perception. I am tenacious. I recently went back and worked on a problem related to the wagon wheel illusion – following up on work that had lain dormant for over 35 years but which was still interesting to me and worth pursuing.

*You’ve worked closely with experimentalists throughout your career. What are the key ingredients of a successful collaboration?*

Most important is having great respect for the knowledge and abilities of your experimental collaborators and finding someone who shares common interests. It also helps to realize that experimental findings will generally trump the theory in terms of the importance and interest. When carrying out research, I like to go into the laboratory when data is being collected and look at data carefully. This helps to focus on dynamics that may be the most interesting mathematically. Finally, experimentalists usually have way overcommitted the funds, so it helps to be willing to cover the costs of students and if possible to share in the costs of the experiments.

*How do you ensure depth in research while working with very diverse topics and using various methods?*

I do not worry about whether the problems I am studying are “deep”. However, I try to focus on questions in which there are interesting mathematical and physical problems that go beyond a descriptive model of some phenomenon. I prefer analyses where there appear surprising emergent properties of the mathematics that were not anticipated at the initial formulation of the theoretical model. It helps if we also find experimental evidence for the unexpected dynamics. In some cases the unexpected experimental findings help set the agenda for the mathematical analyses.

*We know you are very interested in music, playing an instrument yourself. Any parallels or connections with science or mathematics that you would like to mention?*

I just enjoy the sound of the French horn and the challenge of trying to play it better. Although it would be nice if the study of complicated rhythms of the body improved my ability to play the French horn (or even to count in music), as far as I can tell these are occurring in separate regions of my brain and there is no carryover from one to the other. Both science and music are fun – and I have been privileged to have the opportunity to enjoy them both.

*How do you feel about receiving the 2013 Arthur T. Winfree Prize?*

I am deeply honored to receive this award. Art Winfree was not only an extraordinary scientist, but he was also a colleague and close friend. His intense scientific curiosity and high personal integrity have been beacons in my own career. Since completing my PhD in Chemistry, I have identified with the Mathematical and Theoretical Biology communities, going back to early Gordon Conferences in the 1970s. I have also had the privilege of having been President of the Society for Mathematical Biology. Mathematical Biology is still a young field, with only a few prizes – I am truly delighted to have been selected for this award.

Yes!

Several members of the CAMBAM student community, and professors associated with it, are studying in the field of neuroscience, but CAMBAM has only rarely sponsored neuroscience events. It is time for this to change!

On May 7th, 2013, in room 504 of the McIntyre Medical Building, there will be a workshop on the theme of computational neuroscience. This workshop is by us, students, and for us, students: a chance to learn, exchange, discuss and think about the mathematical tools available to our field, whose usage is often grouped under the expression “Computational Neuroscience”.

Talks have been set up by McGill students and several labs are represented. For this edition, several tools, mainly applicable to electrophysiology, will be presented, but that shouldn’t prevent anybody from coming, as there will be time allotted for open discussion.

The day will go on like this:

12h00-13h00

Free lunch. We invite you to register if you plan on attending; this will help us plan the meals accordingly (add your name to the Doodle survey and select Yes if you want a vegetarian meal or No if you want a non-vegetarian meal).

Link to the doodle page: http://www.doodle.com/4i9r3ffezd6h8wda

Please do so by May 1st, as I will have to settle things with the caterer.

13h00-13h45

*Introduction to regression and classification.*

**Nathan Friedman, Greg Stacey and Rubing Xu**

13h45-14h30

*Dimensionality Reduction: PCA and Gaussian Process Factor Analysis.*

**Frederic Simard**

14h30-15h00

Coffee Break

15h00-15h45

*Applications of Information Theory and Neural Coding.*

**Adam Schneider, Mohsen Jamali and Ashkan Golzar**

15h45-16h30

*Phase Coherence as a measure of synchrony within and between brain areas.*

**Diego Mendoza**

16h30-17h30 (or as long as the discussion goes)

Mixer/Social

It will be a tremendous time, an event you will kick yourself for having missed, so please be there. Again, location and time:

Tuesday, May 7th 2013,

starting at noon

McIntyre Medical Building

3655 Sir William Osler

McGill University, Montréal

See you there!!!

Another semester has ended, another added to the countless number of semesters we have seen since the beginning of our academic careers as professional students. Still, I’m taking the time to share with you what made this last one particularly significant for me. I was registered in the Machine Learning course (COMP-652) and I had such a great time that I didn’t see the semester fly by; or was it because I was overwhelmed by work? It’s hard to tell. Never mind: this course taught me, and the other students who were registered, about a set of tools, not far removed from statistics, that I’m sure you too could benefit from knowing. Unfortunately, there is only one way for you to get the fine details, and that is to register yourself. Still, I’m going to give you a taste of two of the topics covered during this class, and maybe you’ll find yourself interested in learning more on the subject.

Machine learning refers to the ensemble of algorithms that aim at extracting information from a data set. This description is very vague, and this is reflected in the eclectic set of algorithms studied throughout the course.

*Linear Regression*

Code available here.

The first concept taught is regression. Whenever you parameterize some data, you are performing a form of regression. The most basic form is linear regression (Fig. 1), which consists of mapping linearly, with more or less success, a set of input data, usually denoted *x*, onto a set of output data, *t* (as in target).

Linear regression gives the impression that the fit will always be a linear function, which is true, but only in the relationship between the transformed input data and the output data; the model merely has to be linear in its parameters. Therefore, we usually use this equation to define linear regression:

t_{i} = **W**^{T}ϕ(x_{i})

Where the function ϕ(x_{i}) represents the input data x_{i}, an element of **x**, a vector of input data, in the feature space. Let’s be honest, this notion of feature space kept me confused for a while, so I don’t expect you to fully grasp the concept from such a short description, but let me push in another example to help you understand a little more of what I’m trying to explain. When basic linear regression doesn’t work, one might try polynomial regression (Fig. 2). Polynomial regression is still linear regression, but we are now dealing with a linear combination of various powers of *x*. Let’s see how we move from the polynomial expression to the equation of linear regression. For a polynomial expression of degree n=3, this is the equation we get:

t = A + Bx + Cx^{2} + Dx^{3}

Notice that this is not a linear equation in *x*.

By concatenating all the capital letters, the parameters of the model, into a vector **W** and defining the function:

ϕ(x) = [1, x, x^{2}, x^{3}]^{T}

The problem is then restated as:

t = **W**^{T}ϕ(x)

Regression becomes a method to find the parameters **W** that best map the input onto the output data. Often, we will try to minimize the mean squared error (MSE) between the data and the fit. In this case a closed-form solution exists, and the weights minimizing the squared error can be computed with (I am skipping the details, but you can find an extensive description here):

**W** = (Φ^{T}Φ)^{-1}Φ^{T}**t**

where Φ is the matrix whose i-th row is ϕ(x_{i})^{T} and **t** is the vector of targets.

But such an elegant solution is not always available, and when it is not, we will often use methods such as gradient descent to find a satisfying solution to the error minimization problem.
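To make the closed-form route concrete, here is a sketch (in Python rather than the Matlab used in class) of the degree-3 polynomial fit via the normal equations. A library routine such as numpy.linalg.lstsq would do this in one line, but solving the small system by hand shows what is going on:

```python
import random

# Closed-form least squares, W = (Phi^T Phi)^-1 Phi^T t, for a cubic fit.
random.seed(5)
true_w = [1.0, -2.0, 0.5, 0.3]                       # the "A, B, C, D" above
xs = [i / 10 - 2 for i in range(41)]                 # x in [-2, 2]
ts = [sum(w * x ** k for k, w in enumerate(true_w)) + random.gauss(0, 0.05)
      for x in xs]

phi = [[x ** k for k in range(4)] for x in xs]       # feature matrix Phi

# Normal equations: (Phi^T Phi) W = Phi^T t
A = [[sum(r[i] * r[j] for r in phi) for j in range(4)] for i in range(4)]
b = [sum(r[i] * t for r, t in zip(phi, ts)) for i in range(4)]

# Solve the 4x4 system by Gaussian elimination with partial pivoting.
for col in range(4):
    piv = max(range(col, 4), key=lambda r: abs(A[r][col]))
    A[col], A[piv] = A[piv], A[col]
    b[col], b[piv] = b[piv], b[col]
    for r in range(col + 1, 4):
        f = A[r][col] / A[col][col]
        A[r] = [a - f * c for a, c in zip(A[r], A[col])]
        b[r] -= f * b[col]
W = [0.0] * 4
for i in reversed(range(4)):
    W[i] = (b[i] - sum(A[i][j] * W[j] for j in range(i + 1, 4))) / A[i][i]

print([round(w, 1) for w in W])  # close to true_w = [1.0, -2.0, 0.5, 0.3]
```

The fit is a cubic in *x* yet still "linear regression", because the solution is linear in the parameters **W**: exactly the feature-space point made above.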

Linear regression is presented because it represents the essence of machine learning. It is used to introduce concepts such as error measurement, problems associated with overfitting, cross-validation and the bias/variance trade-off. Then the Bayesian view of regression is introduced; this view is the foundation of several machine learning algorithms, such as logistic regression. Eventually, we move on to more complex algorithms that not only regress and extract the trend from a set of data, but also perform classification.

*Principal component analysis (PCA)*

Code and dataset available here.

Several of you must already have heard of PCA. Its main application is dimensionality reduction, but as we will see, it can also be used to transform the input data’s representation space so as to make categories linearly separable, even when they weren’t in their original reference frame.

Basically, PCA is a method that applies a linear transformation to the data so as to map it into a different frame of reference. Each orthogonal component (axis) of the new reference frame is computed so as to capture as much of the variance of the data as possible. It is then possible to select a subset of the components, those that capture most of the variance, and drop the others, to obtain a dataset of lower dimensionality that retains most of the information in the data. (Notice that this is a form of information compression.)

To illustrate PCA, I’m using a dataset I picked up at simafore, which gives the nutrition facts for various brands of cereal. For the purpose of this example, I’m going to use 8 of the reported measurements (i.e. calories, protein, fat, sodium, fiber, carbohydrate, sugars and potassium), and the goal is to relate cereal brands that are similar to each other.

This might sound easy at first: after all, put all the high-sugar cereals on one side and all the low-sugar ones on the other! But that doesn’t take into account fiber or protein content… What if two brands are very similar in sugar content, but very different on all other accounts? We are looking for a measure of similarity that takes all 8 nutrition facts into account at once and, if possible, reports them in a frame of reference that can be visualized. Well, that’s precisely what PCA can do. (It might have occurred to some that we could have normalized the values and computed the Euclidean distance between cereal brands. Unfortunately I cannot explain the reasons here, but this rapidly becomes impractical as the dimensionality of the dataset increases; look up the “curse of dimensionality” for a description.)

Computing the PCA is a fairly simple operation (in Matlab, princomp handles it for you). The first operation is to compute the covariance matrix. Let me remind you that each element of the NxN (N=8) covariance matrix is given by:

cov(x_{i}, x_{j}) = E[(x_{i} − E(x_{i}))(x_{j} − E(x_{j}))]

where E(x) is the expectation, or mean, operator.

Then, we need to find the eigenvectors and eigenvalues of the covariance matrix. Eigenvectors and eigenvalues are the sets of vectors and magnitudes that respect the following equality:

A*v* = λ*v*

Where A is the matrix, *v* is an eigenvector and *λ* an eigenvalue. It is a very important concept that finds applications in many fields related to linear algebra; unfortunately, again, I cannot go in depth here, but I refer you to Wikipedia if you are interested in knowing more about eigenvectors and eigenvalues. For now, you can assume that the resulting set of eigenvectors indicates the directions in which the energy of the covariance matrix is oriented, that all eigenvectors are orthogonal (not always the case in general, but always true in PCA, since the covariance matrix is symmetric) and that the eigenvalues tell us how much of the energy goes in each direction.

The set of eigenvectors is our reference frame. There are as many eigenvectors as there are dimensions in our dataset. The thing is, however, that by looking at the eigenvalues we can decide to drop the dimensions that don’t carry a significant amount of energy. The following figure shows the ordered magnitudes of the eigenvalues obtained from our dataset and the cumulative energy as a function of the number of components used.

It is clear from this illustration that even though we are provided with 8 measurements, it is possible to represent the information in a 3-dimensional space, and this with next to no loss of information. A warning, though: the 3 components aren’t the three most important measurements! Each component is a combination of the 8 measurements, which means that even if we are using only three components, all measurements make a weighted contribution in the new frame of reference. The next figure takes advantage of the reduction in dimensionality to plot, in a 2-dimensional space (still capturing more than 95% of the energy in the variance), the points corresponding to the various cereal brands. For the sake of the demonstration, I used another algorithm taught in the Machine Learning course (K-means) to group the most similar cereal brands into 3 classes (red, green, blue) in the new frame of reference.

It is very apparent in this example that the PCA algorithm generates a reference frame in which it becomes much easier to perform analyses, such as clustering, than it would be in a space of higher dimension. In this example it even makes it possible to represent the significant portion of the information, hidden in an 8-dimensional space, in 2 or 3 dimensions. I won’t go any further into PCA, but from here it would be easy to determine which nutrition facts contribute the most to each component, if there were a need to interpret the results or to build a classifier that assigns any new cereal brand to one of the three classes.
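The whole PCA recipe above fits in a short script. This is a Python sketch of it (the post used Matlab’s princomp), on made-up 2-D data rather than the cereal set: build the covariance matrix, find the leading eigenvector by power iteration, and check how much of the energy the first component captures:

```python
import math
import random

# PCA sketch: covariance matrix -> top eigenvector -> explained variance.
random.seed(6)
# Two strongly correlated measurements, e.g. sugars and calories.
data = []
for _ in range(500):
    z = random.gauss(0, 1)
    data.append((z + random.gauss(0, 0.3), 2 * z + random.gauss(0, 0.3)))

n = len(data)
mx = sum(x for x, _ in data) / n
my = sum(y for _, y in data) / n
cxx = sum((x - mx) ** 2 for x, _ in data) / n
cyy = sum((y - my) ** 2 for _, y in data) / n
cxy = sum((x - mx) * (y - my) for x, y in data) / n

# Power iteration: repeatedly apply the covariance matrix to converge
# on the eigenvector with the largest eigenvalue (the first component).
v = (1.0, 0.0)
for _ in range(100):
    w = (cxx * v[0] + cxy * v[1], cxy * v[0] + cyy * v[1])
    norm = math.hypot(*w)
    v = (w[0] / norm, w[1] / norm)

# Eigenvalue of v (Rayleigh quotient) over the total variance (trace).
lam1 = cxx * v[0] ** 2 + 2 * cxy * v[0] * v[1] + cyy * v[1] ** 2
explained = lam1 / (cxx + cyy)
print(round(explained, 2))  # first component captures nearly all the energy
```

With 8 measurements the idea is identical, only the matrix is 8x8 and you keep the top few eigenvectors instead of one.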

I’m taking the opportunity to tell you that this is the most basic form of PCA. There is another form, called kernel PCA, that performs the crazy operation of projecting the data into a higher-dimensional space (greater than 8) before performing the same dimensionality reduction. It then becomes possible to extract relationships that aren’t readily accessible in the dataset. This is another, more complex and very interesting, topic covered in COMP-652.

*And the rest*

I have introduced linear regression, polynomial regression and principal component analysis, but this is only the tip of the iceberg. The machine learning course covers many other algorithms, some even more powerful, such as Bayesian decoders, support vector machines (SVM), neural networks, kernel machines, K-means, hierarchical and other clustering methods, Hidden Markov Models (HMM) and Bayesian networks; and this is not an exhaustive list.

The course also emphasizes a deep understanding of the practical problems associated with machine learning. What makes one fit better than another? Given a set of data and a goal to attain, how do we pick the right machine learning approach? And even once we have the data and the algorithm, how do we implement it?

The background required for the course: probability and statistics, linear algebra (including matrices), calculus and Matlab. In every case, some background is recommended, but with hard work you can overcome most of the problems; plan to put a lot of time into this class.

Finally, I recommend this course to anyone who is interested in data mining, data analysis, artificial intelligence, modeling, algorithms and so on. Let me know if you have any questions, it will be a pleasure for me to answer them.

PS: I’m planning on organizing a workshop on the subject of data analysis in neuroscience, several machine learning tools will be covered, stay tuned!

The first was a CAMBAM students’ roundtable lunch with Dr. Otto. A handful of us had pizza and soda with her while talking about our research projects and bouncing around related ideas. This was quite fun for me, since I’m not a CAMBAM member and don’t know much about what you folks do; it was great to hear about heart arrhythmias and neuron chemistry and asthma and actin and myosin and all the rest! I hope it was also fun for the CAMBAM folks to hear about my models of floral morphology and pollen dispersal and reproductive isolation. Mathematical biology contains such a diversity of ideas!

The second was Dr. Otto’s CAMBAM talk, “Inferring the past for traits that alter speciation and extinction rates”. The talk was about a method called BiSSE (“Binary-State Speciation and Extinction”) for estimating evolutionary parameters given information about extant species. For example, suppose you have a set of related species – a “clade”, like primates – and you know the value of some binary trait for all of these species – whether they live in trees or on the ground, for example. Suppose you also have a phylogenetic tree for your clade: based upon genetic sequences, probably, you have a tree that expresses which species are most closely related and how long ago their common ancestor lived, all the way back to the ancestor that the whole clade descends from. BiSSE (and related methods) will allow you to answer questions about the evolution of that clade and the way that it is related to the binary trait of interest. For example, what are the speciation rate and the extinction rate over the clade’s history, and do those rates differ significantly between lineages that live in trees and lineages that live on the ground? What is the rate of evolutionary transition from tree-living to ground-living, and vice versa? And was the common ancestor of the clade tree-living (and thus ground-living was a later evolutionary innovation), or ground-living (making tree-living the more recent development)?

The really neat thing about her talk, to me, was that she didn’t just present the method as “received wisdom”; she showed us exactly how the likelihood equations underlying it were derived, based upon an analysis of the ways that the binary states of lineages could change over infinitesimal time intervals. Once you understand how a model like BiSSE is constructed, you can construct your own models with additional constraints or additional parameters – traits that are not just binary, ancestral morphological states constrained by fossils, all sorts of possibilities. The power of this sort of method is immense – you can use information about extant species to tease apart what happened millions of years ago as those species evolved and diverged!
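For readers who want the nuts and bolts, the likelihood equations she derived are laid out in Maddison, Midford and Otto (2007; first reference below). In my own summary notation (an assumption on my part, not the talk’s slides): with speciation rate \(\lambda_i\), extinction rate \(\mu_i\), and transition rate \(q_{ij}\) out of binary state \(i\), the probability \(E_i(t)\) that a lineage in state \(i\) at time \(t\) leaves no surviving descendants, and the likelihood \(D_{N,i}(t)\) of the observed data given that lineage \(N\) is in state \(i\), evolve back in time as

```latex
\frac{dE_i}{dt} = \mu_i - (\lambda_i + \mu_i + q_{ij})\,E_i + \lambda_i E_i^2 + q_{ij} E_j

\frac{dD_{N,i}}{dt} = -(\lambda_i + \mu_i + q_{ij})\,D_{N,i} + q_{ij} D_{N,j} + 2\lambda_i E_i D_{N,i}
```

The terms correspond to the infinitesimal events enumerated in the derivation: extinction, nothing happening, a change of state, and speciation followed by the eventual extinction of one daughter lineage (which can happen in two ways, hence the factor of 2).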

Today Dr. Otto has been giving a workshop on modeling in ecology and evolution; we’ve got more than 50 participants at that, so that’s a wonderful success. Tomorrow (Thursday December 5th), don’t forget that Dr. Otto will be giving a talk at the Redpath Museum’s auditorium, on “The Evolutionary Enigma of Sex”, at 3:00, with a wine and cheese social afterwards. I hope to see you there!

**References:**

*(the original paper on BiSSE:)*

Maddison, W., P. Midford, and S. Otto. 2007. Estimating a binary character’s effect on speciation and extinction. Systematic Biology 56:701–710. doi: 10.1080/10635150701607033

*(a paper on a later related method, BiSSE-ness, that develops the primate example I used above, with fascinating results:)*

Magnuson-Ford, K., and S. Otto. 2012. Linking the Investigations of Character Evolution and Species Diversification. American Naturalist 180:225–245. doi: 10.1086/666649

Hello,

I am writing this to share my experience of the 2012 q-Bio Conference, and to thank CAMBAM and its people for helping me get there.

After choosing a conference that had some of what I knew, and some of what I wanted to get into, I used some handy conference preparation tips to try to make the most of it. The great weather, approachable people, and broad range of fascinating topics made for a very stimulating environment. I met my current postdoc advisor there. All in all, I would highly recommend this conference to researchers (at any level) interested in quantitative biology.

I am very grateful to my graduate advisors for their support, to members of CAMBAM and McGill’s Department of Physiology for their training and friendship through the years, and for the CAMBAM Travel Award that enabled this extremely fulfilling experience.

Regards,

Bart

*To find out more about Bartek’s new home institutions, you can visit:*

*http://sdcsb.org/* San Diego Centre for Systems Biology

*http://biocircuits.ucsd.edu/* Biocircuits Institute

*http://biodynamics.ucsd.edu/* Biodynamics Lab

“Lennart, what do you say we do an art show with CAMBAM?”

“But Grace, we are scientists.”

“Some great art has been done by simple minds.”

Lennart, clearly out of arguments: “Ehhhm, I guess we do an art show then?!”

So, on the 18th of August, the CAMBAM student chapter organized the launch day of Empty Sets with and at The Plant, in coordination with Eastern Bloc. The idea? Let scientists and researchers in and around CAMBAM express the aesthetic connected to or emerging from their work, and let artists express their take on science and modern technology. Oh, and have them do it in one place at the same time, of course, in the hope they would actually talk.

The result was fantastic! The Plant itself offers a surrealist environment: located in the northern Mile End, the collective is based in a large loft converted into an artistically efficient creation space. Since The Plant is used to opening its doors for experimental artistic projects and illustrious musical performances, it wasn’t much of a problem for them to provide the space required for this activity. Under the direction of Grace Brooks, CAMBAM student and part of The Plant collective, a line-up of impressive, intricate, and introspective works; a succession of musicians; and a populated party all came together. Throughout the day we saw artists playing scientists and scientists playing artists, and while it almost sounds like a cliché – we learned more from each other than we possibly expected, and it was a lot of fun. Here is an exposé of the work that our sciento-artists and artisto-scientists put together:

**Participating artists’/scientists’ links:**

Aaron McConomy on cargocollective

Fred Simard’s ‘In the mind of a coder’

Lennart Hilbert and Groj’s ‘Fabric of Experience’

Thanks to Carolyn Hance for supplying photo footage.

[Anyone missing from this list, please contact Lennart and we will fix that -> About the Bloggers section]

Further Empty Sets events at Eastern Bloc are in the planning phase. To stay up to date on Empty Sets and on the CAMBAM student chapter, join the CAMBAM Facebook group or the Student Chapter mailing list.

**Figure 1.** *Area under the receiver operating characteristic (ROC) curve.* **A**, Hypothetical example of two spike-count distributions from trials grouped by conditions X and Y. Spike counts range between 0 and the maximum value, c_{max}. **B**, The curved line, located above the dashed ‘chance’ line, represents the ROC curve that is constructed from the distributions in Panel A by classifying their values with the ideal observer (see Appendix). Classification performance is tested for every possible value of the classification criterion, c, which includes all possible spike counts between 0 and c_{max}. Thus, each value of c corresponds to a point in the ROC curve; the arrow shows how increasing values of c are mapped. The grey region is the area under the ROC curve. **C**, Behavioural sensitivity (or stimulus sensitivity) is defined as the area under the ROC curve that compares a distribution of failed-trial (or noise) spike counts (grey) versus a distribution of correct-trial (or signal) spike counts (open). The area under the ROC curve quantifies the difference between the two distributions.

Receiver operating characteristic (ROC) curves provide an unbiased, non-parametric way of quantifying the difference between any two distributions [for mathematical derivation, see 4] – in this case, the number of spikes fired by a neurone on one set of trials versus another. Indeed, ROC analysis can be used in applications as diverse as medical imaging, materials testing, weather forecasting, information retrieval, polygraph lie detection, and aptitude testing [5], to name a few. Figure 1 illustrates how an ROC curve is used to quantify the difference between two distributions of spike counts (Figure 1A).

Faced with the problem of classifying a randomly sampled spike count as being from either distribution **X** (open) or **Y** (filled), the strategy adopted by the ideal observer is to choose a criterion level *c* and assign any spike count below *c* to **X**, and any spike count above *c* to **Y**. In other words, the decision rule used by the ideal observer is to assign spike count *s* (randomly drawn from either **X** or **Y**) to distribution **X** if *s* < *c*, or to distribution **Y** if *s* > *c*. All possible values of *c* between 0 and the maximum spike count (*c*_{max}) must be tested to find the criterion that classifies correctly most often.

To do so, an ROC curve (Figure 1B) is built by plotting the probability that spike counts sampled from **X** are greater than *c* (false positives) against the probability that spike counts sampled from **Y** are greater than *c* (true positives). When *c* = 0, all spike counts are greater than *c*, thus the starting point of the ROC curve is always (1,1). As *c* is increased (Figure 1B, arrow) the performance of the ideal observer using each criterion level is plotted. When *c* reaches *c*_{max}, no spike count is greater than *c* and the end point of the curve is (0,0).

The area under the ROC curve (Figure 1B, grey shading) is the probability that the ideal observer will correctly classify any given spike count, randomly drawn from either distribution, and ranges between 0 and 1 accordingly; this probability is 0.75 for example distributions **X** and **Y** from Figure 1A. Therefore, when **X** and **Y** are completely distinct from each other, the ideal observer correctly classifies 100% of all spike counts (area = 1, Figure 1C, right). On the other hand, if there is no distinction between **X** and **Y**, then the ideal observer has a 50% chance of correct classification – a coin toss (area = 0.5, left). If **X** and **Y** from Figure 1A switch positions then the difference between them remains the same; this is reflected by the area under the ROC curve, which is an equal distance below 0.5 after the switch (0.25 = 0.5 – 0.25) as it was before (0.75 = 0.5 + 0.25).
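To make the criterion sweep concrete, here is a minimal Python sketch (my own illustration, not code from the studies discussed; counts equal to the criterion are classed as ‘above’) that builds the ROC curve from two lists of integer spike counts and integrates the area under it with the trapezoid rule:

```python
def roc_area(x_counts, y_counts):
    """ROC curve and area for two lists of integer spike counts.

    Sweeps the criterion c over every possible spike count; counts
    equal to c are treated as 'above' the criterion.
    """
    c_max = max(max(x_counts), max(y_counts))
    points = []
    for c in range(c_max + 2):                       # c = 0 .. c_max + 1
        fp = sum(s >= c for s in x_counts) / len(x_counts)   # false positives
        tp = sum(s >= c for s in y_counts) / len(y_counts)   # true positives
        points.append((fp, tp))                      # runs from (1,1) to (0,0)
    # Trapezoid rule over consecutive points gives the area under the curve.
    area = 0.0
    for (fp1, tp1), (fp0, tp0) in zip(points, points[1:]):
        area += (fp1 - fp0) * (tp1 + tp0) / 2.0
    return points, area
```

For the fully separated distributions `[0, 1, 2]` versus `[3, 4, 5]` the area comes out as 1.0, while identical distributions give 0.5, matching the coin-toss case described above.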

**Example 1 – A neurone’s stimulus sensitivity**

The classic studies of Newsome and colleagues demonstrated the power of a careful comparison between neural activity and perceptual behaviour [1, 2]. Experiments were performed to carefully measure the discrimination sensitivity of MT neurones from monkey subjects performing a 2AFC motion-discrimination task. The subjects had to report whether the coherent motion in a patch of randomly moving dots was in the preferred or null (preferred + 180°) direction of an isolated MT neurone. It was critical to match the direction, speed, and location of dot motion to the neurone’s receptive-field (RF) preferences. This ensured that the subject was responding to the same stimulus as the neurone. More importantly, it maximised the chance that spikes recorded from the neurone were used by the subject to perform the task.

The direction of motion was randomly drawn on every trial so that the subject would have to watch the coherent dots carefully in order to make a correct choice. However, the strength of coherent motion was also varied from trial to trial by changing the percentage of dots that moved together. This varied the difficulty of the task and therefore the subject’s performance, which provided a frame of reference. The neurone’s ability to discriminate the direction of coherent dot motion at any one difficulty level could be directly compared against the performance of the subject.

A receiver operating characteristic (ROC) analysis (Figure 1) was used to quantify the discrimination sensitivity of MT neurones in the 2AFC task. For this, two distributions of spike counts were compared against each other, the distribution of counts from trials when the coherent motion was in the neurone’s preferred direction (distribution **Y** in Figure 1A) versus the distribution of counts from trials with coherent motion in the null direction (distribution **X** in Figure 1A). The resulting ROC areas (Figure 1B) described the probability that an ideal observer could tell which direction had been presented to the subject, based on the distribution of MT spike counts. This was computed separately for each level of coherent motion strength and compared directly against the subject’s performance. It was found that the average MT neurone could account for the subject’s discrimination sensitivity – at least under the particular conditions of the experiment [see 6].

**Example 2 – A neurone’s behavioural sensitivity**

The classic studies of Newsome and colleagues highlighted the large variation in the choices made by subjects and in the number of spikes fired by MT neurones. In response to statistically identical stimuli with low signal to noise ratios, subjects would sometimes report the wrong direction and their neurones would sometimes fire as if the opposite direction had been shown. However, this variation presented an exciting opportunity – because the ROC curve is a versatile tool and can be used to compare any two distributions of neural activity. Celebrini and Newsome [7] performed a ground-breaking analysis: they measured the correlation between the number of spikes fired by a neurone and the choice that the subject was about to make.

They began by grouping trials based on the ‘preferred’ or ‘null’ motion discrimination report made by the subject. Then they computed the ROC curve comparing the distribution of null-trial spike counts (distribution **X** in Figure 1A) versus the distribution of preferred-trial spike counts (distribution **Y** in Figure 1A). The area under this ROC curve is the probability that the ideal observer could correctly predict, from spike counts, which direction of motion the subject would choose. This kind of ROC metric was named ‘choice probability’ when it was later used to analyse MT neurones [8]. However, we will refer to this ROC metric, and others like it, as ‘behavioural sensitivity’, because it measures how well the neural response predicts perceptual behaviour. It is important to keep in mind that behavioural sensitivity does not measure the correlation between spike counts and perception itself – only the perceptual report, which may not always be faithful to what was actually perceived.

Similar to stimulus sensitivity, a behavioural sensitivity of 0.5 shows that there was no difference in the number of spikes fired prior to either choice (Figure 1C, left). If more spikes were fired prior to choices coinciding with the neurone’s preferred direction, then behavioural sensitivity would rise towards 1, to indicate a positive correlation (Figure 1C, middle and right). On the other hand, if more spikes were fired prior to null direction choices, then behavioural sensitivity would sink towards 0, to indicate a negative correlation. On average, MT neurones had a weak but significant, positive correlation with the subject’s upcoming choice of motion direction [8].
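A convenient shortcut (again my own illustration, not a recipe from the original studies): the area under the ROC curve equals the probability that a randomly drawn preferred-choice spike count beats a randomly drawn null-choice count, with ties counted as half – the Mann–Whitney U statistic divided by the number of pairs. Behavioural sensitivity can therefore be computed without building the curve at all:

```python
def behavioural_sensitivity(null_counts, pref_counts):
    """P(preferred-trial count > null-trial count), ties scored 1/2.

    Equals the area under the ROC curve comparing the two spike-count
    distributions (Mann-Whitney U divided by the number of pairs).
    """
    wins = 0.0
    for x in null_counts:            # spike counts preceding null choices
        for y in pref_counts:        # spike counts preceding preferred choices
            if y > x:
                wins += 1.0
            elif y == x:
                wins += 0.5          # ties split evenly between the two
    return wins / (len(null_counts) * len(pref_counts))
```

A value near 0.5 means the counts carry no information about the upcoming choice; values above 0.5 indicate the positive correlation described above, and values below 0.5 the negative one.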

Since then, behavioural sensitivities have been observed between MT spike counts and the subject’s upcoming choice when discriminating coherent dot motion direction [9-11], speed [12, 13], disparity [14, 15], and cylindrical rotation [16-18]. Similar behavioural sensitivities have been observed between a subject’s discrimination performance and spike counts from cortical areas V2 [19, 20] and MST [7, 21, 22], and even spike counts from somatosensory cortex [23, 24]. Using behavioural sensitivity, correlations have also been observed between MT spike counts and the subject’s ability to detect a change in coherent motion strength [6, 25] and speed [26], while similar behavioural sensitivities have been observed between a subject’s detection performance and spike counts from cortical areas V1 [27], V4 [28, 29], and VIP [6].

**Acknowledgement**

This blog is adapted from a chapter in the open access book ‘Visual Cortex’ [30] in accordance with its Creative Commons Attribution 3.0 Unported licence.

**Bibliography**

1. Newsome, W.T., K.H. Britten, and J.A. Movshon, *Neuronal correlates of a perceptual decision.* Nature, 1989. **341**(6237): p. 52-4.

2. Britten, K.H., et al., *The Analysis of Visual-Motion – a Comparison of Neuronal and Psychophysical Performance.* Journal of Neuroscience, 1992. **12**(12): p. 4745-4765.

3. Kara, P., P. Reinagel, and R.C. Reid, *Low response variability in simultaneously recorded retinal, thalamic, and cortical neurons.* Neuron, 2000. **27**(3): p. 635-646.

4. Green, D.M. and J.A. Swets, *Signal detection theory and psychophysics*. 1966, New York: Wiley. 455.

5. Swets, J.A., *Measuring the accuracy of diagnostic systems.* Science, 1988. **240**(4857): p. 1285-93.

6. Cook, E.P. and J.H.R. Maunsell, *Dynamics of neuronal responses in macaque MT and VIP during motion detection.* Nature Neuroscience, 2002. **5**(10): p. 985-994.

7. Celebrini, S. and W.T. Newsome, *Neuronal and Psychophysical Sensitivity to Motion Signals in Extrastriate Area Mst of the Macaque Monkey.* Journal of Neuroscience, 1994. **14**(7): p. 4109-4124.

8. Britten, K.H., et al., *A relationship between behavioral choice and the visual responses of neurons in macaque MT.* Visual Neuroscience, 1996. **13**(1): p. 87-100.

9. Purushothaman, G. and D.C. Bradley, *Neural population code for fine perceptual decisions in area MT.* Nature Neuroscience, 2005. **8**(1): p. 99-106.

10. Cohen, M.R. and W.T. Newsome, *Estimates of the Contribution of Single Neurons to Perception Depend on Timescale and Noise Correlation.* Journal of Neuroscience, 2009. **29**(20): p. 6635-6648.

11. Law, C.T. and J.I. Gold, *Neural correlates of perceptual learning in a sensory-motor, but not a sensory, cortical area.* Nature Neuroscience, 2008. **11**(4): p. 505-513.

12. Liu, J. and W.T. Newsome, *Correlation between speed perception and neural activity in the middle temporal visual area.* J Neurosci, 2005. **25**(3): p. 711-22.

13. Price, N.S.C. and R.T. Born, *Timescales of Sensory- and Decision-Related Activity in the Middle Temporal and Medial Superior Temporal Areas.* Journal of Neuroscience, 2010. **30**(42): p. 14036-14045.

14. Uka, T. and G.C. DeAngelis, *Contribution of area MT to stereoscopic depth perception: Choice-related response modulations reflect task strategy.* Neuron, 2004. **42**(2): p. 297-310.

15. Sasaki, R. and T. Uka, *Dynamic Readout of Behaviorally Relevant Signals from Area MT during Task Switching.* Neuron, 2009. **62**(1): p. 147-157.

16. Dodd, J.V., et al., *Perceptually bistable three-dimensional figures evoke high choice probabilities in cortical area MT.* J Neurosci, 2001. **21**(13): p. 4809-21.

17. Parker, A.J., K. Krug, and B.G. Cumming, *Neuronal activity and its links with the perception of multi-stable figures.* Philosophical Transactions of the Royal Society of London Series B-Biological Sciences, 2002. **357**(1424): p. 1053-1062.

18. Krug, K., B.G. Cumming, and A.J. Parker, *Comparing perceptual signals of single V5/MT neurons in two binocular depth tasks.* Journal of Neurophysiology, 2004. **92**(3): p. 1586-1596.

19. Nienborg, H. and B.G. Cumming, *Macaque V2 neurons, but not V1 neurons, show choice-related activity.* Journal of Neuroscience, 2006. **26**(37): p. 9567-9578.

20. Nienborg, H. and B.G. Cumming, *Decision-related activity in sensory neurons reflects more than a neuron’s causal effect.* Nature, 2009. **459**(7243): p. 89-92.

21. Gu, Y., D.E. Angelaki, and G.C. Deangelis, *Neural correlates of multisensory cue integration in macaque MSTd.* Nat Neurosci, 2008. **11**(10): p. 1201-10.

22. Gu, Y., et al., *Perceptual learning reduces interneuronal correlations in macaque visual cortex.* Neuron, 2011. **71**(4): p. 750-761.

23. de Lafuente, V. and R. Romo, *Neuronal correlates of subjective sensory experience.* Nature Neuroscience, 2005. **8**(12): p. 1698-1703.

24. Hernandez, A., et al., *Decoding a Perceptual Decision Process across Cortex.* Neuron, 2010. **66**(2): p. 300-314.

25. Bosking, W.H. and J.H. Maunsell, *Effects of stimulus direction on the correlation between behavior and single units in area MT during a motion detection task.* J Neurosci, 2011. **31**(22): p. 8230-8.

26. Herrington, T.M. and J.A. Assad, *Neural Activity in the Middle Temporal Area and Lateral Intraparietal Area during Endogenously Cued Shifts of Attention.* Journal of Neuroscience, 2009. **29**(45): p. 14160-14176.

27. Palmer, C., S.Y. Cheng, and E. Seidemann, *Linking neuronal and behavioral performance in a reaction-time visual detection task.* Journal of Neuroscience, 2007. **27**(30): p. 8122-8137.

28. Cohen, M.R. and J.H.R. Maunsell, *A Neuronal Population Measure of Attention Predicts Behavioral Performance on Individual Trials.* Journal of Neuroscience, 2010. **30**(45): p. 15241-15253.

29. Cohen, M. and J. Maunsell, *Using Neuronal Populations to Study the Mechanisms Underlying Spatial and Feature Attention.* Neuron, 2011. **70**(6): p. 1192-1204.

30. Smith, J.E.T., N.Y. Masse, C. Zhan, and E.P. Cook, *Linking neural activity to visual perception: separating sensory and attentional contributions*, in *Visual Cortex*, S. Molotchnikoff and J. Rouat, Editors. 2012, InTech.