[HKML] Hong Kong Machine Learning Meetup Season 1 Episode 2

Wilson Fok - Processing Medical 3D Scans - An example of heart segmentation

Wilson presented his approach to the 2018 Atrial Segmentation Challenge, based on an ensemble of convolutional neural networks. Here are the slides of his talk, and the GitHub repository for the heart segmentation code.

Abstract: Training an ensemble of convolutional neural networks requires substantial computational resources for a large set of high-resolution medical 3D scans, because deep representations require many parameters and layers. In this study, 100 3D late gadolinium-enhanced (LGE) MRIs, with a spatial resolution of 0.625 mm × 0.625 mm × 0.625 mm, from patients with atrial fibrillation were utilized. To contain the training cost, down-sampling of images, transfer learning and an ensemble of the network’s past weights were deployed. The approach consists of an image processing stage using down-sampling and contrast limited adaptive histogram equalization, a network training stage using a cyclical learning rate schedule, and a testing stage using an ensemble. The method achieves reasonable segmentation accuracy, with a median Dice coefficient of 0.87, while remaining usable on a computer with a Kepler-architecture GPU and at least 3 GB of memory.
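
To make the image processing stage more concrete, here is a minimal sketch of down-sampling followed by contrast limited adaptive histogram equalization (CLAHE), applied slice by slice with scikit-image. The function name, scale factor and clip limit are illustrative choices on my part, not taken from Wilson’s actual code.

```python
# Hedged sketch of the pre-processing stage described in the abstract:
# down-sampling + CLAHE applied per axial slice. Parameters are illustrative.
import numpy as np
from skimage import exposure, transform

def preprocess_volume(volume: np.ndarray, scale: float = 0.5) -> np.ndarray:
    """Down-sample a 3D volume and apply CLAHE slice by slice."""
    # Down-sample every spatial dimension to reduce memory and compute cost.
    small = transform.rescale(volume, scale, anti_aliasing=True, preserve_range=True)
    # equalize_adapthist expects intensities in [0, 1].
    small = (small - small.min()) / (small.max() - small.min() + 1e-8)
    equalized = np.stack([exposure.equalize_adapthist(s, clip_limit=0.03)
                          for s in small])  # CLAHE on each axial slice
    return equalized.astype(np.float32)

# Example usage with a random volume standing in for an LGE-MRI scan.
if __name__ == "__main__":
    dummy = np.random.rand(80, 256, 256)  # (slices, height, width), arbitrary size
    print(preprocess_volume(dummy, scale=0.5).shape)
```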

By the way, he just finished his PhD thesis, and is looking for an interesting opportunity to apply his skills in AI. Do not hesitate to contact him!

Kris Methajarunon - A summary of “Machine Learning and Finance: The New Empirical Asset Pricing” (SoFiE Summer School, Chicago)

Kris presented his takeaways from the SoFiE Summer School at the University of Chicago: Machine Learning and Finance: The New Empirical Asset Pricing. He particularly focused his presentation on Empirical Asset Pricing via Machine Learning, a very recent paper (this version: July 21, 2018) exploring the use of different machine learning regressions on a dataset of economic variables to predict future stock returns.

Personal opinion: Notably, despite being a recent paper, the authors still take a rather outdated view of neural networks as very general (universal approximator) non-linear regressors, rather than as useful representation builders whose representations can then be used efficiently by linear models. A big claim of the paper is that one can reach a Sharpe ratio above 2 using these neural networks (once again, only an old pyramidal architecture (1993) is tested), whereas linear models only achieve a Sharpe ratio below 0.5 on the same dataset. Bagging/boosting tree methods lie somewhere in between. It is hard to evaluate and reproduce such papers. People in the audience were doubtful, and some results ran against their own experience. I would like to see more code for such papers (a GitHub repository with some sample data). Also, since the point is only about the regression prowess of such non-linear models, why not test them on synthetic stochastic time series, i.e. series sampled from a model whose properties and expected results are known? At least it would be reproducible, and it would help to answer questions such as: do historical quarterly data really provide enough data points to fit a neural network?!
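
As a rough illustration of the synthetic test suggested above, here is a hedged sketch (the predictors, signal strength and sample sizes are arbitrary choices of mine, not from the paper): returns are sampled from a known linear data-generating process with a low signal-to-noise ratio, and a plain linear regression is compared out of sample with a small neural network.

```python
# Sketch of a controlled experiment on synthetic return data: since the
# data-generating process is known, we know what each regressor should recover.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
T = 2000                              # periods; quarterly data would give far fewer
signal = rng.normal(size=(T, 5))      # 5 observable predictors
beta = np.array([0.05, -0.03, 0.0, 0.02, 0.0])
returns = signal @ beta + rng.normal(scale=1.0, size=T)  # linear DGP, low SNR

X_train, X_test = signal[:1500], signal[1500:]
y_train, y_test = returns[:1500], returns[1500:]

lin = LinearRegression().fit(X_train, y_train)
mlp = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000,
                   random_state=0).fit(X_train, y_train)

print("linear out-of-sample R^2:", r2_score(y_test, lin.predict(X_test)))
print("MLP    out-of-sample R^2:", r2_score(y_test, mlp.predict(X_test)))
```

Shrinking the training sample to a few hundred observations (the scale of historical quarterly data) in this kind of controlled setup makes it easy to see whether the non-linear model still adds anything or simply overfits.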

Gautier Marti (https://gmarti.gitlab.io/) - A review of two decades of correlations, hierarchies, networks and clustering in financial markets

I presented a review of some clustering and network analysis techniques applied to financial datasets, and their statistical limits. The bulk of the literature focuses on correlations between returns, yet these methods can also be useful for alternative datasets. Here are the slides. I also presented (not included in the previous slides) a way of combining several rankings provided by experts, based on a ‘rank-dominance’ directed network that can be cast into a Markov chain. The stationary distribution of the random walk provides the final combined ranking. I wrote a first draft about this approach there.
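
For readers curious how such a rank combination can work in practice, below is a minimal sketch assuming an MC4-style construction (the exact transition rule in my draft may differ): each item transitions towards items that a majority of experts rank higher, a small teleportation term makes the chain ergodic, and the stationary distribution of the resulting random walk orders the items.

```python
# Hedged sketch of rank aggregation via a Markov chain stationary distribution.
import numpy as np

def combine_rankings(rankings, damping=0.85, n_iter=200):
    """rankings: list of lists; each inner list orders item indices, best first."""
    n = len(rankings[0])
    # position[e][i] = rank of item i according to expert e (lower is better)
    position = [{item: pos for pos, item in enumerate(r)} for r in rankings]
    P = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # item i defers to item j if a majority of experts prefer j to i
            wins_j = sum(pos[j] < pos[i] for pos in position)
            if wins_j > len(rankings) / 2:
                P[i, j] = 1.0
        if P[i].sum() == 0:
            P[i, i] = 1.0            # undominated item: self-loop
        P[i] /= P[i].sum()
    # Teleportation (as in PageRank) makes the stationary distribution unique.
    P = damping * P + (1 - damping) / n
    pi = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        pi = pi @ P                  # power iteration towards the stationary distribution
    return np.argsort(-pi)           # items from best to worst in the combined ranking

# Example: three experts ranking four items (item 0 is preferred by most experts).
print(combine_rankings([[0, 1, 2, 3], [1, 0, 2, 3], [0, 2, 1, 3]]))
```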